Abstract

Sturmian words are infinite binary words with many equivalent definitions: they have a
minimal factor complexity among all aperiodic sequences; they are balanced sequences (the
labels 0 and 1 are as evenly distributed as possible); and they can be constructed using a
mechanical definition. All these properties make them good candidates for being extremal points in
scheduling problems over two processors. In this paper, we consider the problem of generalizing
Sturmian words to trees. The problem is to evenly distribute labels 0 and 1 over infinite trees.
We show that (strongly) balanced trees exist and can also be constructed using a mechanical
process as long as the tree is irrational. Such trees also have a minimal factor complexity.
Therefore they bring the hope that extremal scheduling properties of Sturmian words can be
extended to such trees, at least partially. Such possible extensions are illustrated by one
example.

Keywords Infinite trees, Sturmian words, Sturmian trees 

1 Introduction 

In scheduling problems with an infinite number of tasks, the optimal strategy may no longer be
to execute tasks "as soon as possible" but rather "as regularly as possible". Keeping this in mind,
let us consider the following question: how to distribute ones and zeros over an infinite sequence
$w = (w_n)_{n \in \mathbb{N}}$ such that the ones (and the zeros) are spread as evenly as possible? In a more formal
way, the sequence $w$ is balanced if the number of ones in a factor $w_i, w_{i+1}, \dots, w_{i+\ell-1}$ of length $\ell$
varies by at most 1 as $i$ ranges over all positions, for every $\ell$. Such sequences exist and are called Sturmian words
when they are not periodic.

Sturmian words are quite fascinating binary sequences: they have many different characterizations
formulated in terms coming from as many mathematical frameworks, in which they always
prove very useful. For example, Sturmian words have a geometric description as digitized straight
lines and as such have been used in computer visualization (see [13] for a review). They can also be
defined using an arithmetic characterization using a repetitive rotation on a torus or continued
fraction decompositions. From a combinatorial point of view, yet another characterization of Sturmian
words is based on the balance between ones and zeros in all factors, as mentioned before. They
are also used in symbolic dynamic system theory because they are aperiodic words with minimal
factor complexity or because they have palindromic properties. Most of these equivalences have
been known since the seminal work in [16].

More recently, Sturmian sequences have also been used for optimization purposes: they are
extreme points of multimodular functions [12, 2]. This has applications in scheduling theory. In
[11] rather general scheduling problems with two processors are considered. A simple case is the
following two processor mapping problem. An infinite number of tasks of unit size are to be executed
over two processors (labeled 0 and 1) with related speeds $v_0$ and $v_1$ such that $1/v_0 + 1/v_1 > 1$. The
tasks are released every time unit. It is shown that an optimal schedule (minimizing the average
flow time) allocates task $i$ to processor $w_i$ according to a sequence $w_1, w_2, w_3, \dots$ that is Sturmian.

Another example solved in [10] is the following processor allocation problem: A single processor
(with unit speed) is used to execute two types of tasks. Tasks of type 1 (resp. 2) are released every
time unit and are all of size $S_0$ (resp. $S_1$). The allocation of the processor to the tasks can be seen
as a binary sequence $w_1, w_2, \dots$ saying which task is to be served next. Here also there exists an
optimal Sturmian sequence (minimizing the average flow time of all tasks).

Actually more general scheduling problems are solved by Sturmian sequences. For instance,
if the tasks are released according to a stationary process and the task sizes are also stochastic,
independent of the release process, then both problems mentioned above are also solved by Sturmian
sequences.

A natural extension is to consider the case where more than two processors can be used to execute
the tasks. This leads to the construction of generalized Sturmian words in several directions.

The first one is to study words using more than two letters. Billiard sequences in hypercubes
extend the torus definition of Sturmian sequences while episturmian sequences [3] extend the
palindromic characterization of Sturmian words. Unfortunately, both extensions differ substantially and
neither of them provides an optimal schedule for the $k$ processor mapping problem.

Another extension is to two dimensions. A complete characterization of two-dimensional
non-periodic sequences with minimal complexity is given in [6]. Here again the alternative
characterizations are lost.

Yet another generalization is to trees [4], where Sturmian trees are defined as infinite binary
automata such that the number of factors (subtrees) of size $n$ is $n + 1$. The other characterizations
of Sturmian words are lost once more.

Finally, another extension of Sturmian words concerns discrete planes. Here, several characterizations
of Sturmian lines can be extended to discrete planes. Interesting relations between the
multidimensional continued fraction decomposition of the normal direction of the plane and the patterns of its
discretization mimic what happens for Sturmian sequences [8].

The aim of this paper is to do the same for trees. We introduced in [9] a new type of infinite trees:
unordered trees, for which the left and right children of each node are not distinguishable, and gave
a brief presentation of their main properties. Here, we make an exhaustive study of such trees. We
show that the balance property (distributing evenly the labels equal to one or zero over the vertices
of the tree) coincides with a characterization of trees using integer parts of affine functions (called
mechanicity). Furthermore these balanced trees have a minimal factor complexity. Therefore, they
can be seen as a natural extension of Sturmian sequences in more than one aspect. This brings some
hope to use them as extreme points for adapted optimization problems.

Our purpose in the paper is two-fold. The first part of the paper is dedicated to the study of
general unordered infinite trees with binary labels. We provide definitions of the main concepts
as well as the basic properties of unordered trees with a special focus on the notion of density
(the average number of ones) and rationality. The second part of the paper investigates balanced
unordered trees and their properties. In particular we show that strongly balanced trees (defined
later) are mechanical (so that they have a density and all labels can be constructed in almost
constant time). Furthermore their factor complexity is minimal among all non-periodic trees.

We also investigate rational balanced trees by showing that their density is easy to compute and
by providing an algorithm with polynomial complexity to test whether a rational tree is strongly
balanced. Finally, we show that balanced trees are extremal points of some convex functions,
bringing some hope that they can be used to solve optimization problems.

2 Infinite Trees 

For ordered infinite trees, we follow the presentation given in [4]. Ordered infinite trees are automata
with an infinite number of states. An automaton is a tree-automaton if it has one initial state and
each state has a uniform in-degree equal to one (except for the initial state, whose in-degree is 0)
and a uniform out-degree $d$ with labels $a_1, \dots, a_d$ on the arcs. Every node $v$ is labeled by $\ell(v) = 1$
(resp. 0) if it is final (resp. non-final).

The language accepted by the tree-automaton $T$ is a subset of $A^*$ (where the alphabet $A =
\{a_1, \dots, a_d\}$) and is denoted by $\mathcal{L}(T)$. Thus, a word $w$ in the free monoid $A^*$ corresponds to a node
in $T$, and a word $w$ in $\mathcal{L}(T)$ corresponds to a node in $T$ with label 1. Conversely, a unique
tree-automaton can be associated to any subset $L$ of $A^*$, by labeling by one the nodes corresponding to
the words in $L$.

Classically for automata, a family of equivalence relations can be defined over the nodes of tree
$T$: $u \sim_0 v$ if $\ell(v) = \ell(u)$; $u \sim_{n+1} v$ if $u \sim_n v$ and for all $i$, the $i$th child of $u$, $u a_i$, and the $i$th child
of $v$, $v a_i$, satisfy $u a_i \sim_n v a_i$. By definition of $\sim_n$, $u \sim_n v$ if and only if the subtree rooted in $u$ of
height $n$ is the same as the subtree rooted in $v$ of height $n$.

$\mathcal{L}(T)$ is recognized by its minimal deterministic automaton (possibly infinite), say $\mathcal{A}(T)$.
Actually, $\mathcal{A}(T)$ can be obtained from the tree $T$ by merging all the states of the tree that lie in the same
equivalence class of $\sim_n$ for all $n$.

An example is given in Figure 1, where the infinite tree-automaton recognizing all the prefixes of
the Fibonacci word (over the alphabet $\{a, b\}$) is given together with
the corresponding minimal automaton (which has an infinite number of states).




Figure 1: The tree-automaton recognizing the Fibonacci word and the corresponding minimal 
automaton 

The number of subtrees of size $k$ in $T$ is called the complexity $P(k)$ of $T$. $P(k)$ is the number
of equivalence classes of $\sim_k$. If $P(k) \le k$ for at least one $k$, then it can be shown ([4]) that the
complexity is bounded by $k$. This implies that the minimal automaton $\mathcal{A}(T)$ has at most $k$ states. The tree
is therefore called rational, since it recognizes a rational language.

If a tree-automaton $T$ is such that $P(k) = k + 1$ for all $k$, then it has a minimal complexity
among all non-rational trees. Such trees have been shown to exist and are called Sturmian in [4] by
analogy with the factor complexity definition of Sturmian words. In [4] several classes of Sturmian
tree-automata are presented. However such trees are not balanced and cannot be defined using a
mechanical construction, as is the case for Sturmian words.

In the following we rather consider a different type of trees, namely infinite directed graphs with
labels 0 or 1 on nodes and with uniform in-degree 1 and out-degree $d \ge 2$. Here, the children of
a node are not ordered. Thus, the main difference with the previous definition is that arcs are
not labeled. Therefore such trees cannot be bijectively associated with languages. However, it is
possible to construct a minimal multi-graph (i.e. with multiple arcs) $G(T)$ associated with the
tree $T$, mimicking the construction of the minimal automaton for ordered trees. Let us consider a
family of equivalence relations over the nodes of $T$:

$v \equiv_0 u$ if $u$ and $v$ have the same label: $\ell(u) = \ell(v)$, and

$v \equiv_{n+1} u$ if $v \equiv_n u$ and the children of $v$ are equivalent (for $\equiv_n$) to the children of $u$.
Therefore, $v \equiv_n u$ if and only if the subtree with root $v$ of height $n$ is isomorphic to the subtree
with root $u$ of height $n$. By merging the nodes of $T$ when they belong to the same equivalence
classes, for all $n$, one gets the minimal multi-graph $G(T)$ of the factors of $T$: all nodes merged in
the same vertex of $G(T)$ have the same subtrees of every height.
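When the tree is given by a finite multi-graph, the successive equivalence relations defined above can be computed by iterated refinement. The following Python sketch illustrates the idea. The encoding of the graph (a list `children` of child tuples and a list `labels`), the function name `refine_classes` and the three-node example are assumptions made for illustration; they do not come from the paper.

```python
def refine_classes(children, labels, n):
    """Return, for each node u, an identifier of its class after n refinements.

    children[u] is the tuple of the d children of node u (their order is irrelevant),
    labels[u] is the label (0 or 1) of node u.
    """
    classes = list(labels)  # level 0: two nodes are equivalent iff they share a label
    for _ in range(n):
        ids = {}
        refined = []
        for u in range(len(children)):
            # Sorting the classes of the children turns the signature into a multiset,
            # which reflects the fact that children are unordered.
            signature = (classes[u], tuple(sorted(classes[c] for c in children[u])))
            refined.append(ids.setdefault(signature, len(ids)))
        classes = refined
    return classes


# Hypothetical example with d = 2: nodes 1 and 2 carry identical subtrees,
# so they are merged into a single vertex of the minimal multi-graph.
children = [(1, 2), (1, 1), (2, 2)]
labels = [0, 1, 1]
classes = refine_classes(children, labels, 5)
number_of_classes = len(set(classes))
```

Nodes that still share an identifier after every refinement step are exactly those that would be merged into one vertex of the minimal multi-graph.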

In $G(T)$, the node corresponding to the root of $T$ is distinguished (graphically, this is done by
adding an arrow pointing to the node).

There exists a way to associate an ordered tree-automaton $\mathcal{T}$ to a tree $T$ by choosing an order
on the children of each node. This can be done by seeing $G(T)$ as an automaton, labeling the arcs
of $G(T)$ with the letters $a_1, \dots, a_d$ in an arbitrary fashion. Conversely, a tree-automaton $\mathcal{T}$ can be
converted into a graph $T$ by removing the labels on the arcs. This graph is called the unordered
version of $\mathcal{T}$.

An example of an unordered tree is given in Figure 2. The label of a black (resp. white) node is 1
(resp. 0). The arcs are implicitly directed from top to bottom. Actually, most figures in this paper will
represent binary trees (with out-degree 2), although the whole discussion is carried out
for arbitrary degrees. The nodes of the associated multi-graph $G(T)$ are numbered arbitrarily and
nodes with label 1 are displayed with a bold circle. The node corresponding to the root of the tree
is pointed to by an arrow. This tree can be seen as the tree-automaton recognizing the Fibonacci word
where the labels on the arcs have been removed (there is no longer a left and right child at each
node). Note that while the minimal automaton is infinite (see Figure 1), the minimal graph $G(T)$
is finite, with two nodes, one corresponding to the tree where all labels are 0 and one to the tree with all labels
equal to 0 except on one branch (see Figure 2).

2.1 Irreducibility and periodicity 

By analogy with Markov chains, a tree T is irreducible if G(T) is strongly connected. Also, an 
irreducible tree T is periodic with period p if the greatest common divisor of the lengths of all 
cycles in G{T) is p. A tree with period 1 is also called aperiodic. 






Figure 2: A tree and the associated minimal multi-graph.



2.2 Factors, complexity and Sturmian trees 

A factor of size $n$ (and width 1) is a subgraph of $T$ which is a complete subtree of height $n$. The
number of nodes in a factor of size $n$ is denoted by $S(n) = \frac{d^n - 1}{d - 1}$.

A factor of size $n$ and width $k$ (with root $v$) is a subgraph of $T$ which is the subtree of height
$k + n$ rooted in $v$ minus the subtree of height $k$ rooted in $v$. The number of nodes of a factor of
size $n$ and width $k$ is $S(n, k) = S(n + k) - S(k) = \frac{d^{n+k} - d^k}{d - 1}$.

Similarly to what has been done for words, the factor complexity $\mathcal{P}_T(n)$ of a tree $T$ is the number
of distinct factors of size $n$ and width 1.

The complexity $\mathcal{P}_T(n)$ of a tree can be bounded by the total number of ways to label trees of
height $n$ and degree $d$, say $A_n$.

It should be clear that $A_1 = 2$ (a node can be labeled 0 or 1) and that $A_{n+1} = 2M(A_n, d)$ where
$M(x, y)$ is the number of multisets with $y$ elements taken from a set with $x$ elements. Therefore
using binomial coefficients,

$$A_{n+1} = 2 \binom{A_n + d - 1}{d}.$$

This is a polynomial recurrence equation of degree $d$. A change of variable
yields a new recurrence equation $u_{n+1} = d u_n + \varepsilon_n$ where $\varepsilon_n = o(1)$. This implies that $A_n$ grows like $\phi^{d^n}$
for some $\phi$ with $1 < \phi < 2$.

As for lower bounds on the complexity of a tree, it will be shown in Section 3 that trees such
that $\mathcal{P}_T(n) \le n$ for at least one $n$ are rational, i.e. have a bounded number of factors of any size
(this means that the minimal multi-graph is finite).

Therefore, trees $T$ such that $G(T)$ is infinite and with a minimal complexity should satisfy
$\mathcal{P}_T(n) = n + 1$ for all $n$. These trees will be called Sturmian trees by analogy with words. It is not difficult
to exhibit such trees. For example, starting with a Sturmian word $w$, a binary tree such that all
nodes on level $i$ have label $w_i$ is Sturmian.

Another more interesting example is the Dyck tree. The Dyck tree is represented in Figure 3.
This tree is the unordered version of the tree-automaton recognizing the Dyck language (language
generated by the context-free grammar $S \to aSbS \mid \varepsilon$) and it is not hard to see that this tree is
Sturmian. For that, consider the graph $G(T)$ associated with the Dyck tree $T$, also displayed in
Figure 3.

There are two factors of size 1 in $T$: those with a root labeled 1 (all associated with node 0 in
$G(T)$) and those with a root labeled 0 (associated with nodes $\infty, 1, 2, \dots$ in $G(T)$). This corresponds
to the equivalence classes for $\equiv_1$.

As for factors of size $n$, all those with a root in node $\infty$ or in nodes $n, n + 1, n + 2, \dots$ have all their labels
equal to 0: no path with fewer than $n$ arcs starting from one of these nodes reaches the only node with label 1, namely node 0.

As for the factors starting in node $i$ of $G(T)$ with $0 \le i < n$, the first node with label 1
is at level $i + 1$. This means that all these factors are distinct. In other words, the equivalence
classes for $\equiv_n$ are $\{\infty, n, n + 1, \dots\}, \{0\}, \{1\}, \dots, \{n - 1\}$. The number of distinct factors of size $n$
is therefore $n + 1$.
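The count of factors of the Dyck tree can be checked numerically. The sketch below is only an illustration under an assumed encoding of the graph of Figure 3, reconstructed from the Dyck grammar rather than taken from the figure: node $i \ge 1$ has children $i + 1$ and $i - 1$, node 0 has children 1 and $\infty$, node $\infty$ has two children $\infty$, and only node 0 carries label 1. The names `children`, `label` and `dyck_complexity` are hypothetical.

```python
from functools import lru_cache

INF = "inf"  # sink node reached once a word can no longer be completed into a Dyck word


def children(node):
    """Children of a node in the assumed minimal graph of the Dyck tree."""
    if node == INF:
        return (INF, INF)
    if node == 0:
        return (1, INF)
    return (node + 1, node - 1)


def label(node):
    """Only node 0 (all parentheses matched) is labeled 1."""
    return 1 if node == 0 else 0


def dyck_complexity(n):
    """Number of distinct factors of size n (complete subtrees with n levels)."""
    ids = {}

    @lru_cache(maxsize=None)
    def factor_id(node, height):
        # A factor is identified by the label of its root and the multiset
        # of the identifiers of the factors hanging below the root.
        if height == 1:
            key = (label(node),)
        else:
            below = tuple(sorted(factor_id(c, height - 1) for c in children(node)))
            key = (label(node), below)
        return ids.setdefault(key, len(ids))

    # Nodes n, n + 1, ... all give the all-zero factor, so a few of them suffice.
    nodes = [INF] + list(range(n + 2))
    return len({factor_id(v, n) for v in nodes})
```

Under this encoding, `dyck_complexity(n)` is expected to return $n + 1$, in agreement with the equivalence classes listed above.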




Figure 3: The Dyck tree and its minimal graph.



2.3 Density 

The density of a tree $T$ is meant to capture the average number of ones in the tree.

For a node $v$ and a height $n > 0$, we define the density of the factor of size $n$ with root $v$ as the
average number of nodes with label 1 in this subtree. Let us call $d_v(n)$ the density of the factor of
size $n$ with root $v$ and let $r$ be the root of the tree $T$. In the following we will be using four notions
of density.

• The rooted density of the tree is the limit of the density of the subtrees of the root $r$ (if it
exists):

$$\lim_{n \to \infty} d_r(n)$$

• The rooted average density of the tree is the Cesàro limit of these densities:

$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} d_r(i)$$

• The density of the tree is $\alpha$ if it has an identical rooted density for all nodes $v$:

$$\forall v : \alpha = \lim_{n \to \infty} d_v(n)$$

• The average density of the tree is $\alpha$ if it has an identical rooted average density for all nodes
$v$:

$$\forall v : \alpha = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} d_v(i)$$

From the definition, we have the following direct implications: If a tree admits a density, then it
admits an average density. In turn, a tree with an average density also has a rooted average density.
Also, a tree with a density has a rooted density.

Although the rooted definitions seem more natural and simple, the general densities
have the advantage that they do not depend on the choice of the root. See Figure 4 for some examples.
These examples will be further developed in the following section on rational trees.




Figure 4: The first tree has a density of 1/2, the second one an average density equal to 1/2 but 
no density. The last one has a rooted density 1/2 but no average density. 



3 Rational trees 

A tree T is rational if the associated minimal multi-graph G(T) is finite. 

An example of a rational tree $T$ is displayed in Figure 5 together with its graph $G(T)$. Note
that this tree is not irreducible. One final strongly connected component has period 2 (it corresponds to the
alternating subtrees starting with ones and zeros displayed on the left) while the other one is
aperiodic (it corresponds to the subtree with all its labels equal to one, displayed on the right).




Figure 5: A rational tree made of two distinct subtrees and its associated multi-graph 

It is possible to characterize rational trees using their complexity.

Theorem 3.1. The following propositions are equivalent:

1. the tree $T$ is rational;

2. there exists $n$ such that $\mathcal{P}(n) \le n$;

3. there exists $n$ such that $\mathcal{P}(n) = \mathcal{P}(n + 1)$;

4. there exists $B$ such that for all $n$, $\mathcal{P}(n) \le B$.
Proof. The proof of this result is similar to the proof for words.

1 implies 2: If $G(T)$ is finite, then the number of factors of size $n$ in $T$ is at most the size of
$G(T)$; therefore, there exists $n$ such that $\mathcal{P}(n) \le n$.

2 implies 3: Since $\mathcal{P}(1) = 2$ and $\mathcal{P}(n) \le n$ and since $\mathcal{P}$ is non-decreasing with $n$, there exists
$1 \le k < n$ such that $\mathcal{P}(k) = \mathcal{P}(k + 1)$.

3 implies 4: If $\mathcal{P}(n) = \mathcal{P}(n + 1) = p$ then let us call $A^n_1, \dots, A^n_p$ all the distinct factors of size $n$ in
$T$. Since $\mathcal{P}(n + 1) = p$, each $A^n_i$ is prolonged in a unique way into a tree of size $n + 1$, called $A^{n+1}_i$.
Now, each subtree $A^{n+1}_i$ is composed of a root and $d$ factors of size $n$, in the set $\{A^n_1, \dots, A^n_p\}$. In
turn, they are all prolonged into trees of size $n + 1$ in a unique way. Therefore, $\mathcal{P}(n + 2) = p$. By a
direct induction, $\mathcal{P}(k) = p$ for all $k \ge n$.

4 implies 1: If the number of factors of size $n$ is at most $B$ for all $n$, then the
number of equivalence classes for $\equiv_n$ is at most $B$ for all $n$; this means that $G(T)$ has at most
$B$ nodes. □



3.1 Density of rational trees 

Let $T$ be a rational tree and let $G(T)$ be its minimal multi-graph. The nodes of $G(T)$ are numbered
$v_1, \dots, v_K$, with $v_1$ corresponding to the root of $T$.

$G(T)$ can be seen as the transition kernel of a Markov chain by considering each arc of $G(T)$ as
a transition with probability $1/d$.

If $G(T)$ is irreducible then the Markov chain admits a unique stationary measure $\pi$ on its nodes.
The density of $T$ and the stationary measure $\pi$ are related by the following theorem.

Theorem 3.2. Let $T$ be an irreducible rational tree with a minimal multi-graph $G(T)$ with $K$ nodes.
Let $\ell = (\ell_1, \dots, \ell_K)$ be the labels of the nodes of $G(T)$ and let $\pi = (\pi_1, \dots, \pi_K)$ be the stationary
measure over the nodes of $G(T)$.

If $T$ is aperiodic, then $T$ admits a density $\alpha = \pi \ell^t$.

If $T$ is periodic with period $p$ then $T$ admits an average density $\alpha = \pi \ell^t$.

Proof. Let $V_n$ be a Markov chain corresponding to $G(T)$. Since $G(T)$ is irreducible, $V_n$ admits a
unique stationary measure, say $\pi = (\pi_1, \dots, \pi_K)$. Let us call $P$ the kernel of this Markov chain:
$P_{i,j} = a/d$ if there are $a$ arcs in $G(T)$ from $v_i$ to $v_j$.

Now, let us consider all the paths of length $n$ in $T$, starting from an arbitrary node $v_i$. By
construction of $G(T)$, the number of paths that end up in the nodes $v_1, \dots, v_K$ of
$G(T)$, respectively, is given by the vector $e_i d^n P^n$, where $e_i$ is the vector with all its coordinates equal to 0 except
the $i$th coordinate, equal to 1.

Now, the number of ones in the tree of height $n$ starting in $v_i$ is $h_n(v_i) = e_i \sum_{k=0}^{n-1} d^k P^k \ell^t$.

Let us first consider the case where $P$ is aperiodic. We denote by $\Pi$ the matrix with all its
rows equal to the stationary measure $\pi$ and by $D_k$ the matrix $P^k - \Pi$. When $P$ is aperiodic,
$\lim_{k \to \infty} \|D_k\|_1 = 0$. Therefore, for all $k \ge n$, $\|D_k\|_1 \le \varepsilon_n \to 0$.

Then the density of ones $d_{2n}(v_i) = h_{2n}(v_i)/S(2n)$ can be estimated by splitting the factors of
size $2n$ into a factor of size $n$ at the root and $d^n$ factors of size $n$. One gets

$$d_{2n}(v_i) = \frac{d-1}{d^{2n}-1} \left( e_i \sum_{k=0}^{n-1} d^k P^k \ell^t + e_i \sum_{k=n}^{2n-1} d^k D_k \ell^t + e_i \sum_{k=n}^{2n-1} d^k \Pi \ell^t \right).$$

When $n$ goes to infinity, the first term goes to 0 because $e_i \sum_{k=0}^{n-1} d^k P^k \ell^t \le S(n)$ and $S(n)/S(2n) \to 0$. As for the
second term, $\left| \frac{d-1}{d^{2n}-1} e_i \sum_{k=n}^{2n-1} d^k D_k \ell^t \right| \le \frac{d-1}{d^{2n}-1} \sum_{k=n}^{2n-1} d^k \varepsilon_n \le \varepsilon_n$. This goes to 0 when $n$ goes to infinity.
As for the last term, $\frac{d-1}{d^{2n}-1} e_i \sum_{k=n}^{2n-1} d^k \Pi \ell^t = \frac{d^{2n} - d^n}{d^{2n} - 1} (e_i \Pi) \ell^t$; since $e_i \Pi = \pi$, this goes to $\pi \ell^t$ when
$n$ goes to infinity.

The same holds when computing the density of trees of size $2n + 1$ by splitting them into the first
$n + 1$ levels and the last $n$ levels.

This shows that the rooted density of all the trees in $T$ is the same, equal to $\pi \ell^t$.

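Let us now consider the case when the tree is periodic with period $p$. In that case, the kernel of
$p$ steps of the Markov chain can be put under the form

$$P^p = \begin{pmatrix} P_1 & 0 & \cdots & 0 \\ 0 & P_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & P_m \end{pmatrix}.$$
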

The sub-matrices $P_1, \dots, P_m$ are the kernels of aperiodic chains defined on a partition $S_1, \dots, S_m$
of the nodes of $G(T)$. Let us denote by $\alpha_1, \dots, \alpha_m$ the densities of the factors of size $np$, starting in
$S_1, \dots, S_m$, respectively (they exist because this has just been proved for aperiodic trees).

Starting from a node $v$, the average density of a tree of size $n = p q_n + r_n$, $r_n < p$, is

$$\frac{1}{n} \sum_{k=1}^{n} d_v(k) = \frac{1}{p q_n + r_n} \sum_{a=0}^{q_n - 1} \sum_{b=1}^{p} d_v(ap + b) + \frac{1}{p q_n + r_n} \sum_{k=1}^{r_n} d_v(p q_n + k).$$

The first term goes to $(\alpha_1 + \dots + \alpha_m)/m$ while the second term goes to zero, when $n$ goes to
infinity, independently of the root. Finally, $(\alpha_1 + \dots + \alpha_m)/m = (\pi^1 \ell_1^t + \dots + \pi^m \ell_m^t)/m = \pi \ell^t$,
where $\pi^1, \dots, \pi^m$ are the stationary probabilities for the kernels $P_1, \dots, P_m$ and $\ell_1, \dots, \ell_m$ are the
vectors of the labels in $S_1, \dots, S_m$.

□ 

An example illustrating the computation of the density of an aperiodic irreducible rational tree
is given in Figure 6. The stationary measure of the Markov chain is $\pi = (2/9, 3/9, 4/9)$. Therefore,
the density is $\alpha = \frac{2}{9} \ell_1 + \frac{3}{9} \ell_2 + \frac{4}{9} \ell_3 = 4/9$.
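The relation between the stationary measure and the density can be illustrated numerically. The following Python sketch solves $\pi P = \pi$ exactly with rational arithmetic and compares $\pi \ell^t$ with the density of a large factor. The two-node graph used here is a hypothetical example (it is not the graph of Figure 6, whose arcs are not reproduced in the text), and the function names are assumptions.

```python
from fractions import Fraction


def stationary_measure(children, d):
    """Exact stationary measure of the chain where each arc has probability 1/d."""
    k = len(children)
    # P[i][j] = (number of arcs from i to j) / d
    P = [[Fraction(children[i].count(j), d) for j in range(k)] for i in range(k)]
    # Linear system A x = b encoding pi P = pi, with the last equation
    # replaced by the normalization sum(pi) = 1.
    A = [[P[j][i] - (1 if i == j else 0) for j in range(k)] for i in range(k)]
    b = [Fraction(0)] * k
    A[-1] = [Fraction(1)] * k
    b[-1] = Fraction(1)
    # Gauss-Jordan elimination (assumes an irreducible chain, so pivots exist).
    for col in range(k):
        pivot = next(r for r in range(col, k) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        inv = 1 / A[col][col]
        A[col] = [x * inv for x in A[col]]
        b[col] = b[col] * inv
        for r in range(k):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
                b[r] = b[r] - f * b[col]
    return b


def factor_density(children, labels, v, n, d):
    """Density of ones in the factor of size n rooted at node v."""
    counts = list(labels)  # number of ones in the factors of size 1
    for _ in range(n - 1):
        counts = [labels[u] + sum(counts[c] for c in children[u])
                  for u in range(len(children))]
    return Fraction(counts[v] * (d - 1), d ** n - 1)


# Hypothetical irreducible aperiodic graph with d = 2.
children = [(1, 1), (0, 1)]
labels = [1, 0]
pi = stationary_measure(children, 2)
alpha = sum(p * l for p, l in zip(pi, labels))
```

For this hypothetical graph, `factor_density(children, labels, v, n, 2)` approaches `alpha` as `n` grows, whichever node `v` is chosen as root, which is the behaviour stated by Theorem 3.2 in the aperiodic case.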

As for the reducible case, it should be easy to see that a rational tree may have different (average)
densities for some of its subtrees (this is the case for the leftmost tree in Figure 4). Therefore, a
reducible tree does not have a density nor an average density in general.

Figure 6: An irreducible aperiodic rational tree and its minimal graph. The stationary probabilities
over the associated Markov chain are $\pi = (2/9, 3/9, 4/9)$. The density of the tree is $\alpha = 4/9$.

Let us call $S_1, \dots, S_m$ the final strongly connected components of $G(T)$. Let $\alpha_1, \dots, \alpha_m$ be the
average densities of the components $S_1, \dots, S_m$, respectively. Finally, let $R = (R_1, \dots, R_m)$ be the
probabilities of reaching the components $S_1, \dots, S_m$ starting from the root $v_1$, in the Markov chain
associated with $G(T)$. Then, the following is true.

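Theorem 3.3. A rational tree always has a rooted average density $\alpha = (\alpha_1, \dots, \alpha_m) R^t$.

Proof. If $P$ is reducible, $P$ can be decomposed into

$$P = \begin{pmatrix} Q & K_1 & \cdots & K_m \\ 0 & P_1 & & 0 \\ \vdots & & \ddots & \\ 0 & 0 & & P_m \end{pmatrix} \quad \text{and} \quad P^n = \begin{pmatrix} Q^n & K'_1 & \cdots & K'_m \\ 0 & P_1^n & & 0 \\ \vdots & & \ddots & \\ 0 & 0 & & P_m^n \end{pmatrix}.$$
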

Considering all the paths in $G(T)$ of length $n$, starting in the root, the number of paths ending
in component $S_\ell$ is $N_\ell(n) = d^n \sum_{i \in S_\ell} P^n_{1,i}$. If we decompose all the paths ending in $S_\ell$ into two
sub-paths, one (of length $k$) before entering $S_\ell$ and one (of length $n - k$) inside $S_\ell$, we get from the
decomposition of $P^n$, $N_\ell(n) = d^n \sum_{k=0}^{n-1} (1, 0, \dots, 0) Q^k K_\ell u_\ell$, where $u_\ell$ is a vector whose coordinates
are 1 in $S_\ell$ and 0 everywhere else.

The number of ones in the rooted subtree of $T$ of size $2n$ is the number of ones in all the paths of
length $n$ plus the number of ones in the subtrees of size $n$ attached at their ends. When $n$ is large, the number of ones in
the paths can be neglected with respect to the number of ones in the end trees.

Finally, the number of ones in a tree of size $2n$ is asymptotically the number of ones in each possible end-tree
of size $n$ times the number of such trees, namely $N_\ell(n)$. When $n$ goes to infinity, the density of
ones goes to $\sum_{\ell=1}^{m} \alpha_\ell (1, 0, \dots, 0)(I - Q)^{-1} K_\ell u_\ell = (\alpha_1, \dots, \alpha_m) R^t$, with $R_\ell = (1, 0, \dots, 0)(I -
Q)^{-1} K_\ell u_\ell$. □

An example illustrating the computation of the rooted average density of a tree is given in Figure
5. The graph $G(T)$ has two final components, one aperiodic component with density 1 and another
one with period 2 with average density 1/2. Starting from the root, both components are reached
with probability 1/2. Therefore, such a tree has an average rooted density $\alpha = \frac{1}{2} \cdot \frac{1}{2} + \frac{1}{2} \cdot 1 =
3/4$.
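The rooted average density of such a reducible tree can be approximated by averaging the densities of the factors rooted at the root. The sketch below does this for a hypothetical four-node graph matching the description above (a root whose two children lie in a period-2 component and in an all-ones component); the exact graph of Figure 5, the label of the root and the function names are assumptions.

```python
def ones_per_size(children, labels, root, max_size):
    """ones[n - 1] = number of ones in the factor of size n rooted at `root`."""
    counts = list(labels)  # factors of size 1
    result = [counts[root]]
    for _ in range(max_size - 1):
        counts = [labels[u] + sum(counts[c] for c in children[u])
                  for u in range(len(children))]
        result.append(counts[root])
    return result


def rooted_average_density(children, labels, root, d, max_size):
    """Cesaro average of the densities d_root(1), ..., d_root(max_size)."""
    ones = ones_per_size(children, labels, root, max_size)
    # S(n) = (d**n - 1) / (d - 1) nodes in a factor of size n
    densities = [h * (d - 1) / (d ** n - 1) for n, h in enumerate(ones, start=1)]
    return sum(densities) / max_size


# Hypothetical graph with d = 2:
#   node 0: root (label 0), children in each final component
#   nodes 1, 2: alternating component of period 2 (labels 1 and 0)
#   node 3: all-ones aperiodic component
children = [(1, 3), (2, 2), (1, 1), (3, 3)]
labels = [0, 1, 0, 1]
average = rooted_average_density(children, labels, 0, 2, 400)
```

On this example the densities $d_r(n)$ oscillate and have no limit, while their Cesàro average approaches $\frac{1}{2} \cdot \frac{1}{2} + \frac{1}{2} \cdot 1$, as given by the formula of Theorem 3.3.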

Also, it is not difficult to show that if all final components have a density (rather than an average
density), then the tree has a rooted density, given by the same formula as in Theorem 3.3.

Finally, it is fairly straightforward to prove that since the $K \times K$ kernel $P$ of the Markov chain
associated with $G(T)$ has all its elements of the form $a/d$, the stationary probabilities $\pi$ as
well as the average rooted density $\alpha$ of a rational tree are rational numbers of the form $c/b$ with
$0 \le c \le b$, where $b$ is bounded by a quantity that depends only on $d$ and $K$. This fact will be used in the algorithmic Section 5 to make sure that the
complexity of the algorithms does not depend on the size of the numbers.

4 Balanced and Mechanical Trees 

In this section, we will introduce our most important definitions: strongly balanced and mechanical 
trees and explore the relations between them. In particular we will prove that in the case of 
irrational trees they represent the same set of trees, giving us a constructive representation of this 
class of trees. These results are very similar to the ones on words, which are summarized below. 

4.1 Sturmian, Balanced and Mechanical Words 

One definition of a Sturmian word uses the complexity of a word. The complexity of an infinite
word $w$ is a function $\mathcal{P}_w : \mathbb{N} \to \mathbb{N}$ where $\mathcal{P}_w(n)$ is the number of distinct factors of length $n$ of the
word $w$. A word is periodic if there exists $n$ such that $\mathcal{P}_w(n) \le n$. Sturmian words are aperiodic
words with minimal complexity, i.e. such that for any $n$:

$$\mathcal{P}_w(n) = n + 1. \quad (1)$$

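If $x$ is a factor of $w$, its height $h(x)$ is the number of letters equal to 1 in $x$. A balanced word is a
word where the letters 1 are distributed as evenly as possible:

$$\forall x, y \text{ factors of } w, \quad |x| = |y| \Rightarrow |h(x) - h(y)| \le 1. \quad (2)$$

A mechanical word is constructed using integer parts of affine functions. Let $\alpha \in [0; 1]$ and $\phi \in
[0; 1)$. The lower (resp. upper) mechanical word of slope $\alpha$ and phase $\phi$, $w = w_1 w_2 \cdots$ (resp.
$w' = w'_1 w'_2 \cdots$), is defined by:

$$w_n = \lfloor (n + 1)\alpha + \phi \rfloor - \lfloor n\alpha + \phi \rfloor \quad (\text{resp. } w'_n = \lceil (n + 1)\alpha + \phi \rceil - \lceil n\alpha + \phi \rceil). \quad (3)$$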

These three definitions represent almost the same set of words. In the case of aperiodic words, 
they are equivalent: a word is Sturmian if and only if it is balanced and aperiodic if and only if it 
is mechanical of irrational slope. For periodic words, there are similar relations: 

• A rational mechanical word is balanced. 

• A periodic balanced word is ultimately mechanical. 

A word is an ultimately mechanical word if it can be written as $xw$ where $x$ is a finite word and $w$ is
a mechanical word. An example of a balanced word which is not mechanical (and just ultimately
mechanical) is the infinite word with all letters 0 and just one letter 1. For a more complete
description of Sturmian words, we refer to [14].
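The equivalence between the three definitions can be observed on a concrete word. The sketch below builds a prefix of the lower mechanical word of slope $(\sqrt{5} - 1)/2$ and phase 0, assuming the usual convention $w_n = \lfloor (n+1)\alpha \rfloor - \lfloor n\alpha \rfloor$, and checks balance and complexity on that prefix. The choice of slope, the exact integer computation of $\lfloor n\alpha \rfloor$ and the function names are assumptions made for this illustration.

```python
from math import isqrt


def golden_mechanical_word(length):
    """Prefix of the lower mechanical word of slope (sqrt(5) - 1) / 2 and phase 0."""
    def floor_n_alpha(n):
        # floor(n * (sqrt(5) - 1) / 2), computed exactly with integers:
        # isqrt(5 * n * n) is floor(n * sqrt(5)).
        return (isqrt(5 * n * n) - n) // 2
    return [floor_n_alpha(n + 1) - floor_n_alpha(n) for n in range(length)]


def is_balanced(word, max_len):
    """Check |h(x) - h(y)| <= 1 for all factors x, y of equal length <= max_len."""
    for size in range(1, max_len + 1):
        heights = {sum(word[i:i + size]) for i in range(len(word) - size + 1)}
        if max(heights) - min(heights) > 1:
            return False
    return True


def complexity(word, size):
    """Number of distinct factors of the given size occurring in the finite word."""
    return len({tuple(word[i:i + size]) for i in range(len(word) - size + 1)})


w = golden_mechanical_word(2000)
```

Since only a finite prefix is inspected, `complexity(w, n)` is a lower bound on the complexity of the infinite word; for small `n` and a long prefix it is expected to equal $n + 1$.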
4.2 Balanced and strongly balanced trees



Using the two definitions of factors of a tree, we define two notions of balance for trees: the first
one, and probably the most natural one, is what we call balanced trees and the other one is called
strongly balanced trees.

Definition 4.1 (Balanced and strongly balanced tree). A tree is balanced if for all $n > 0$, the
numbers of nodes labeled 1 in any two factors of size $n$ differ by at most 1.

A tree is strongly balanced if for all $n, k > 0$, the numbers of ones in any two factors of size $n$ and width
$k$ differ by at most 1.
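For a rational tree, both properties can be tested up to a given size by counting ones in the factors rooted at each node of the multi-graph. The sketch below is an illustration only: it assumes that every node of the graph is reachable from the root, it counts the ones of a factor of size $n$ and width $k$ as the ones of the subtree of height $k + n$ minus those of the subtree of height $k$ (with $k = 0$ giving the complete subtree), and its function names and example graphs are hypothetical.

```python
def ones_in_subtrees(children, labels, height):
    """Number of ones in the complete subtree of the given height at each node."""
    if height == 0:
        return [0] * len(children)
    counts = list(labels)  # subtrees of height 1 are single nodes
    for _ in range(height - 1):
        counts = [labels[u] + sum(counts[c] for c in children[u])
                  for u in range(len(children))]
    return counts


def is_balanced(children, labels, max_size):
    """Balance test restricted to factors of size 1, ..., max_size."""
    for n in range(1, max_size + 1):
        counts = ones_in_subtrees(children, labels, n)
        if max(counts) - min(counts) > 1:
            return False
    return True


def is_strongly_balanced(children, labels, max_size, max_width):
    """Strong balance test restricted to bounded sizes and widths."""
    for k in range(max_width + 1):
        below = ones_in_subtrees(children, labels, k)
        for n in range(1, max_size + 1):
            total = ones_in_subtrees(children, labels, n + k)
            counts = [t - b for t, b in zip(total, below)]
            if max(counts) - min(counts) > 1:
                return False
    return True


# Hypothetical examples with d = 2.
all_ones = ([(0, 0)], [1])                 # every node is labeled 1
alternating = ([(1, 1), (0, 0)], [1, 0])   # levels alternate between 1 and 0
```

The all-ones tree passes both tests trivially, whereas the alternating tree already fails the balance test at size 3, where two factors contain 5 and 2 ones respectively.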

As the name suggests, the strong balance property implies balance (by taking $k = 1$). In fact
this notion is strictly stronger, see Section 7 for an example of a balanced tree that is not strongly
balanced.

Although the balance property is weaker and seems more natural for a generalization from words, 
our results will be mainly focused on strongly balanced trees that have almost the same properties 
as their counterparts on words. 



4.2.1 Density of a balanced tree 

For all nodes $v$ and all sizes $n$, we denote by $h_v(n)$ the number of ones in the factor of root $v$ and size
$n$. The density $d_v(n)$ of this subtree is the number of ones divided by the cardinality $S(n)$ of the
factor: $d_v(n) = \frac{h_v(n)}{S(n)}$.

Proposition 4.1.1 (Density of balanced tree). A balanced tree has a density $\alpha$.
Moreover for all nodes $v$ and for all sizes $n$:

$$|h_v(n) - \lfloor S(n)\alpha \rfloor| \le 1 \quad (4)$$

Proof. Let $m_n$ be the minimal number of ones in all factors of size $n$. As the tree is balanced, for all
nodes $v$ and $n \ge 1$:

$$m_n \le h_v(n) \le m_n + 1 \quad (5)$$

Now let us consider a factor of size $n + k$ and root $v$. It can be decomposed into a factor of size $k$
of root $v$ and $d^k$ factors of size $n$ at the leaves of the previous factor. The number of ones in these
factors can be bounded using $m_n$ and $m_k$, therefore we have:

$$m_k + d^k m_n \le m_{n+k} \le m_k + 1 + d^k (m_n + 1) \quad (6)$$

The density of a factor of size $n$ satisfies $\frac{m_n}{S(n)} \le d_v(n) \le \frac{m_n + 1}{S(n)}$. Using these facts, we can
bound $d_v(n + k) - d_v(n)$:

$$\frac{m_{n+k}}{S(n + k)} - \frac{m_n + 1}{S(n)} \le d_v(n + k) - d_v(n) \le \frac{m_{n+k} + 1}{S(n + k)} - \frac{m_n}{S(n)}.$$

Using (6), the left-hand side can be lower bounded by

$$(d - 1) \left( \frac{d^k m_n + m_k}{d^{n+k} - 1} - \frac{m_n + 1}{d^n - 1} \right) = (d - 1) \left( \frac{m_n + m_k/d^k}{d^n - 1/d^k} - \frac{m_n + 1}{d^n - 1} \right) \ge -\frac{c}{S(n)}$$

for a constant $c$ that does not depend on $k$.

The same method can be used to prove that $d_v(n + k) - d_v(n)$ is bounded above by a term of order $1/S(n)$, which shows that for $n$
big enough, $|d_v(n + k) - d_v(n)|$ is smaller than $\varepsilon$, regardless of $k$. Thus $d_v(n)$ is a Cauchy sequence
and has a limit $\alpha = \lim_{n \to \infty} d_v(n)$. This limit does not depend on $v$ and the tree has a density.

Let us now prove that $|h_v(n) - \lfloor S(n)\alpha \rfloor| \le 1$: dividing the inequality (6) by $S(n, k)$ and taking
the limit when $k$ tends to $\infty$ leads to:

$$(d - 1) m_n + \alpha \le d^n \alpha \le (d - 1)(m_n + 1) + \alpha.$$

This shows that $S(n)\alpha - 1 \le m_n \le S(n)\alpha$, which implies Equation (4).

□ 

Similar ideas can be used to show that Equation (4) can be improved in the case of strongly
balanced trees: for all widths and sizes $k, n \ge 1$, the number of ones $h(n, k)$ in a factor of size $n$ and
width $k$ satisfies:

$$|h(n, k) - \lfloor S(n, k)\alpha \rfloor| \le 1 \quad (7)$$



4.3 Mechanical trees 

Building balanced trees is not that easy. According to formula (4), each factor of size $n$ must have $\lfloor \alpha S(n) \rfloor$ or $\lfloor \alpha S(n) \rfloor + 1$ nodes labeled one. This leads us to the following construction, inspired by the construction of mechanical words.

Definition 4.2 (Mechanical tree). A tree is mechanical of density $\alpha \in [0;1]$ if for every node $v$, there exists a phase $\phi_v$ which satisfies one of the two following properties:

$$\forall n: \quad h_v(n) = \lfloor S(n)\alpha + \phi_v \rfloor \qquad (8)$$

$$\text{or} \quad \forall n: \quad h_v(n) = \lceil S(n)\alpha - \phi_v \rceil \qquad (9)$$

In the first case, we say that $\phi_v$ is an inferior phase of $v$. In the second case, we say that $\phi_v$ is a superior phase of $v$.



This definition suggests that the phases of all nodes could be arbitrary. In fact, we will see that there exists a unique mechanical tree with a given phase at the root. The second question raised by this definition is the existence and uniqueness of the phase: we call $\phi_v$ "a" phase of the node $v$ and not "the" phase of $v$ since there may exist several phases leading to the same tree.

We begin with a characterization of mechanical trees, given in the following proposition:

Proposition 4.2.1 (Characterization of mechanical trees). For each $\alpha \in [0;1]$ and $\phi \in [0;1)$, there exists a unique mechanical tree of density $\alpha$ such that $\phi$ is an inferior (resp. superior) phase of the root.

Moreover, if $\phi$ is an inferior (resp. superior) phase of a node then $\phi_0 < \dots < \phi_{d-1}$ are inferior (resp. superior) phases of its $d$ children, with

$$\phi_i = \frac{\alpha + \phi + i - \lfloor \alpha + \phi \rfloor}{d} \quad \left(\text{resp. } \phi_i = \frac{\phi + \lceil \alpha - \phi \rceil - \alpha + i}{d}\right) \qquad (10)$$

The proof is done in two steps. First we show that defining the phases as in (10) yields a mechanical tree, then we show that this is the only way to do so.
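As an illustration of relation (10), the following minimal Python sketch derives the phases of the children from the phase of a node and counts the ones of a factor by walking the tree explicitly, so that the count can be compared with $\lfloor S(n)\alpha + \phi \rfloor$. The function names (`S`, `child_phases`, `ones_in_factor`) and the sample values of $\alpha$, $\phi$ and $d$ are illustrative assumptions, not part of the original text, and only inferior phases are handled.

```python
from fractions import Fraction
from math import floor


def S(n, d):
    """Number of nodes of a complete d-ary tree with n levels."""
    return (d**n - 1) // (d - 1)


def child_phases(alpha, phi, d):
    """Inferior phases of the d children, following relation (10)."""
    base = alpha + phi - floor(alpha + phi)
    return [(base + i) / d for i in range(d)]


def ones_in_factor(alpha, phi, d, n):
    """Count the ones of the factor of size n by explicit traversal."""
    if n == 0:
        return 0
    label = floor(alpha + phi)  # label of the root of the factor
    return label + sum(
        ones_in_factor(alpha, child, d, n - 1)
        for child in child_phases(alpha, phi, d)
    )


# Hypothetical sample values; exact rationals avoid rounding issues.
alpha, phi, d = Fraction(2, 7), Fraction(3, 5), 3
for n in range(1, 6):
    assert ones_in_factor(alpha, phi, d, n) == floor(S(n, d) * alpha + phi)
```

Exact `Fraction` arithmetic is used here because floating-point rounding near integer values of $S(n)\alpha + \phi$ could flip a label.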

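Proof. Existence. Let $\alpha \in [0;1]$ and $\phi \in [0;1)$. We want to build a mechanical tree whose root has an inferior phase $\phi$ (the case of a superior phase is similar and is not detailed here). Let $\mathcal{A}$ be an infinite tree. To each node $v$, we associate a number $\phi_v$ defined by:

• The phase of the root is $\phi$.

• If the phase of a node $v$ is $\phi_v$, the phases of its $d$ children satisfy Equation (10).

Then we build a labeled tree by associating to each node $v$ the label $\lfloor \alpha + \phi_v \rfloor$. Let us prove by induction on $n$ that the following relation holds.

$$\text{For all } v: \quad h_v(n) = \lfloor S(n)\alpha + \phi_v \rfloor \qquad (11)$$

By definition of the labels, (11) holds when $n = 1$. Let $n > 0$ and let us assume that (11) holds for $n$. Let $v$ be a node with phase $\phi_v$ and let $\phi_0, \dots, \phi_{d-1}$ be the phases of its children. We assume that $\alpha + \phi_v < 1$ (which means that the label of the node is 0); a similar calculation can be done in the other case, $\alpha + \phi_v \ge 1$.

Using the well-known formula $\sum_{i=0}^{d-1} \left\lfloor \frac{x+i}{d} \right\rfloor = \lfloor x \rfloor$, we can compute $h_v(n+1)$:

$$h_v(n+1) = \sum_{i=0}^{d-1} \lfloor S(n)\alpha + \phi_i \rfloor = \sum_{i=0}^{d-1} \left\lfloor \frac{d^n-1}{d-1}\alpha + \frac{\alpha + \phi_v + i}{d} \right\rfloor = \sum_{i=0}^{d-1} \left\lfloor \frac{\frac{d^{n+1}-1}{d-1}\alpha + \phi_v + i}{d} \right\rfloor = \lfloor S(n+1)\alpha + \phi_v \rfloor.$$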

Therefore, (11) holds for all n which means that the tree is mechanical. 

Uniqueness. Now, let $\mathcal{A}$ be a mechanical tree of density $\alpha$. Let $v$ be a node and $\phi_0, \dots, \phi_{d-1}$ be the phases of its children. Let $i$ and $j$ be two children and let $h_i(n)$ be the number of ones in the $i$th child subtree (of phase $\phi_i$). We want to prove that either for all $n$: $h_i(n) \le h_j(n)$ or for all $n$: $h_i(n) \ge h_j(n)$. If the two nodes are both inferior (resp. superior), this is clearly true: $h_i(n) \le h_j(n)$ if and only if $\phi_i \le \phi_j$ (resp. $\phi_i \ge \phi_j$). If $i$ is inferior and $j$ is superior, it is not difficult to show that $\phi_i < 1 - \phi_j$ implies $h_i(n) \le h_j(n)$ and $\phi_i > 1 - \phi_j$ implies $h_i(n) \ge h_j(n)$.

Therefore we can assume (otherwise we exchange the order of the children) that for all $n$:

$$h_0(n) \le h_1(n) \le \dots \le h_{d-1}(n).$$

Moreover, as $h_{d-1}(n) - h_0(n) \le 1$, there exists $k$ such that $h_0(n) = h_1(n) = \dots = h_k(n) \le h_{k+1}(n) = \dots = h_{d-1}(n)$. As $\sum_{i=0}^{d-1} h_i(n)$ does not depend on $\phi_0, \dots, \phi_{d-1}$, for each $n$ there is only one $k$ that works and therefore there is only one possibility for $h_i(n)$ for all $n$ and all $i$. This implies that the tree with root $v$ is unique.

As we have seen in the beginning of the proof, the phase $\phi_i$ defined in (10) defines correct values for $h_i(\cdot)$. Therefore such a phase $\phi_i$ is a possible phase for the $i$th child. □

This proposition shows that when the phase is fixed the tree is unique. The converse is false and one can find several phases that lead to the same tree (for example, when $\alpha = 0$ all phases define the tree with label 0 everywhere) but we will show next that the set of densities $\alpha$ for which the phases are not necessarily unique has Lebesgue measure zero.

If for all $n$, $S(n)\alpha + \phi \notin \mathbb{N}$, then $\lfloor S(n)\alpha + \phi \rfloor = \lceil S(n)\alpha + \phi - 1 \rceil$. In that case, if $\phi$ is an inferior phase of a node then $1 - \phi$ is a superior phase of the node. Therefore, except in particular cases, there exist at least two phases of a node: one inferior and one superior. Let us now look at the possible uniqueness of the inferior phase.

Let us call $\mathrm{frac}(x)$ the fractional part of a real number $x$ and let us consider the sequence $\{\mathrm{frac}(S(n)\alpha + \phi)\}_{n \in \mathbb{N}}$. If this sequence can be arbitrarily close to 0, this means that for all $\psi < \phi$, there exists $k$ such that $\lfloor S(k)\alpha + \psi \rfloor < \lfloor S(k)\alpha + \phi \rfloor$ and $\psi$ cannot be a phase of the tree. Also, if this sequence can be arbitrarily close to 1, then one can show similarly that for all $\psi > \phi$, $\psi$ is not a phase of the node. Conversely, if there exists $\delta > 0$ such that $\mathrm{frac}(S(n)\alpha + \phi) > \delta$ (resp. $< 1 - \delta$) for all $n$, then let $\phi' = \phi - \epsilon$ (resp. $\phi' = \phi + \epsilon$), with $\epsilon < \delta$. Then $\lfloor S(n)\alpha + \phi \rfloor = \lfloor S(n)\alpha + \phi' \rfloor$ for all $n$.

Finally, a phase $\phi$ is unique if and only if 0 and 1 are accumulation points of the sequence $\{\mathrm{frac}(S(n)\alpha + \phi)\}_{n \in \mathbb{N}}$.

Let us call $x \stackrel{\mathrm{def}}{=} \frac{\alpha}{d-1}$ and $y \stackrel{\mathrm{def}}{=} x - \phi$ and let us consider the sequence

$$\mathrm{frac}(S(n)\alpha + \phi) = \mathrm{frac}(x d^n - y).$$

Let $x_1, \dots, x_k, \dots$ (resp. $y_1, y_2, \dots$) be the sequence of the digits of $x$ (resp. $y$) in base $d$ (also called the $d$-decomposition). We have:

$$x d^n - y = \sum_{k=1}^{n} x_k d^{n-k} + \sum_{k=1}^{\infty} (x_{k+n} - y_k) d^{-k}.$$



The fact that $\mathrm{frac}(x d^n - y)$ is arbitrarily close to 0 implies that for arbitrarily big $k$, there exists $n$ such that

$$x_n, \dots, x_{n+k-2} = y_1, \dots, y_{k-1}, \quad x_{n+k-1} > y_k, \qquad (12)$$

or

$$\mathrm{frac}(x d^n - y) = 0. \qquad (13)$$

Also, the fact that $\mathrm{frac}(x d^n - y)$ is arbitrarily close to 1 implies that for arbitrarily big $k$, there exists $n$ such that

$$x_n, \dots, x_{n+k-2} = y_1, \dots, y_{k-1}, \quad x_{n+k-1} < y_k, \qquad (14)$$

or the $d$-development of $y$ is finite (with only zeros after some point $\ell$: $y = y_1, \dots, y_\ell, 1, 0, 0, \dots$) and for arbitrarily big $k$, there exists $n$ such that

$$x_n, \dots, x_{n+k-2} = y_1, \dots, y_\ell, 0, 1, \dots, 1. \qquad (15)$$

Using this characterization, three cases can be distinguished.

• If $x = \alpha/(d-1)$ is a number such that all finite sequences over $0, \dots, d-1$ appear in its $d$-decomposition, then every phase is unique. In particular, all normal numbers in base $d$ verify this property (a number is normal in base $d$ if all sequences of length $k$ appear uniformly in its $d$-decomposition) and it is known that almost every number in $[0,1]$ is normal (see [5] or [7]).

• If $\alpha \in \mathbb{Q}$, then the sequence $\mathrm{frac}(S(k)\alpha + \phi)$ is periodic and there is no phase $\phi$ such that $\phi$ is unique.

• If $\alpha$ is neither rational nor has the property that all binary sequences appear in $\alpha$, then some $\phi$ can be unique and some others may not. For example, for $d = 2$, if $\alpha$ is (in base 2) the number

$$\alpha = 0.101100111000111100001111100000\dots,$$

then if $\mathrm{frac}(\alpha - \phi) = 0$, $\phi$ is unique (because $\alpha$ satisfies both properties (12) and (15)). However $\phi_1$ and $\phi_2$ such that $\mathrm{frac}(\alpha - \phi_1) = 0.10100$ and $\mathrm{frac}(\alpha - \phi_2) = 0.1010$ are equivalent (generate the same tree).

Other examples of the same type are the rewind trees, drawn in Figure 16. The sequence of digits in base 2 of the density of these trees is a Sturmian word with irrational density. Half of the nodes of the tree are associated with node 0 in the minimal graph and therefore could have the same phase, whereas the phases computed using Equation (10) are not all the same. Therefore, phases are not unique here.
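The non-uniqueness of phases can be observed numerically. The sketch below compares the counts $\lfloor S(n)\alpha + \phi \rfloor$ produced by two candidate inferior phases up to a finite depth; two phases that agree for all $n$ describe the same node. The helper names `counts` and `same_counts_up_to`, the depth bound and the sample values are assumptions made for this illustration, and a finite depth can only suggest, not prove, equivalence.

```python
from fractions import Fraction
from math import floor


def counts(alpha, phi, d, depth):
    """Tuple (h(1), ..., h(depth)) with h(n) = floor(S(n) * alpha + phi)."""
    return tuple(
        floor(Fraction(d**n - 1, d - 1) * alpha + phi)
        for n in range(1, depth + 1)
    )


def same_counts_up_to(alpha, phi1, phi2, d, depth):
    """True when both phases give the same counts for all sizes <= depth."""
    return counts(alpha, phi1, d, depth) == counts(alpha, phi2, d, depth)


# Density 0: every phase gives the all-zero tree.
assert same_counts_up_to(Fraction(0), Fraction(1, 10), Fraction(9, 10), 2, 20)
# Density 1/3 with d = 2: phases 0 and 1/2 are indistinguishable,
# while 1/2 and 3/4 already differ at size 1.
assert same_counts_up_to(Fraction(1, 3), Fraction(0), Fraction(1, 2), 2, 20)
assert not same_counts_up_to(Fraction(1, 3), Fraction(1, 2), Fraction(3, 4), 2, 20)
```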



4.3.1 Phases of a tree 

Let us call $\Phi_v$ the set of numbers that can be phases of a node $v$. The set of the possible phases of a tree is the union of all possible phases of its nodes: $\Phi = \bigcup_v \Phi_v$. The set $\Phi$ may be countable or uncountable. It is countable, for example, when $\alpha/(d-1)$ is normal, since there are at most as many phases as nodes. It is uncountable, for example, for the tree with all labels 0, for which all phases in $[0;1)$ work for each node. Nevertheless, the set of possible phases is dense in $[0;1)$.

Indeed, at least all phases defined by the relation (10) are in $\Phi$. If $\phi$ is the phase of the root, then all nodes at level $k$ have a phase which is the fractional part of:

$$\frac{\phi}{d^k} + \frac{\alpha}{d} + \dots + \frac{\alpha}{d^k} + \frac{i_1}{d} + \dots + \frac{i_k}{d^k} \qquad (16)$$

with $0 \le i_j < d$ for all $j$. Conversely, all of these numbers are the phases of some node at level $k$. As $k$ tends to infinity, by a proper choice of $i_1, \dots, i_k$ the fractional part of this number can be as close as possible to any number in $[0;1]$. Thus the set of phases of the tree is dense in $[0;1]$.
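The density argument can be checked on small cases. The following sketch iterates relation (10) for inferior phases to list all phases found at level $k$, then measures the largest gap between neighbouring phases on the circle $[0;1)$. The names `phases_at_level` and `largest_gap` and the sample parameters are assumptions for illustration only.

```python
from fractions import Fraction
from math import floor


def phases_at_level(alpha, phi, d, k):
    """Sorted inferior phases of the d**k nodes at level k, via relation (10)."""
    level = [phi]
    for _ in range(k):
        nxt = []
        for p in level:
            base = alpha + p - floor(alpha + p)
            nxt.extend((base + i) / d for i in range(d))
        level = nxt
    return sorted(level)


def largest_gap(phases):
    """Largest distance between neighbours, wrapping around [0, 1)."""
    gaps = [b - a for a, b in zip(phases, phases[1:])]
    gaps.append(phases[0] + 1 - phases[-1])
    return max(gaps)


alpha, phi, d = Fraction(2, 7), Fraction(1, 3), 2  # hypothetical values
for k in range(1, 7):
    # The phases at level k are evenly spaced, so the gap shrinks like d**-k.
    assert largest_gap(phases_at_level(alpha, phi, d, k)) == Fraction(1, d**k)
```

Since the gap shrinks geometrically with the level, any point of $[0;1)$ is approached by phases of nodes deep enough in the tree.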
If the density is J'j'+kl\k (with n + k minimal) one can show that the set of all possible phases for 

a given node is [^i^a; min C^ ^_i^ oi, 1)) for some m e 0, . . . , n + — 1. As <& is dense in [0; 1), it 
contains all of these intervals. Therefore, $\Phi = [0;1)$ and the tree has exactly $n+k$ different factors of size greater than $n+k$. Hence its minimal graph has exactly $n+k$ nodes.



4.4 Equivalence between strongly balanced and mechanical trees 

As we have seen in section 4.1, there are strong relations between balanced and mechanical words. In 
this part, we will see that we can prove the same results between strongly balanced and mechanical 
trees. This result is formally stated in the following theorem. 

A tree is ultimately mechanical if all nodes (except finitely many) are mechanical (i.e. satisfy equation (8) or (9)).

Theorem 4.1. The following statements are true.

(i) A mechanical tree is strongly balanced.

(ii) An irrational strongly balanced tree is mechanical.

(iii) A rational strongly balanced tree is ultimately mechanical.

This theorem is the analog of the theorem linking balanced and mechanical words. We have seen that the word $0^k 1 0^\infty$ is balanced but not mechanical, only ultimately mechanical. Its counterpart for trees would be a tree with all labels equal to 0 except for one node which has label 1. The node labeled 1 can be chosen as deep as desired, which shows that we cannot bound the size of the "non-mechanical" beginning of the tree. A more complicated example is drawn in Figure 8.

Let us begin with the proof of the first part of the theorem:

Lemma 4.2.1. A mechanical tree is strongly balanced.

Proof. Let $n, k \in \mathbb{N}$. For every node $v$, $h_v(n,k)$ is the number of ones in the factor of size $n$ and width $k$ rooted in $v$. We want to prove that for all pairs of nodes $v$ and $v'$: $|h_v(n,k) - h_{v'}(n,k)| \le 1$.

We assume that the nodes $v$ and $v'$ are inferior of phase $\phi$ and $\phi'$ (the proof with superior phases is similar).

$$h_v(n,k) - h_{v'}(n,k) = \left\lfloor \frac{d^{n+k}-1}{d-1}\alpha + \phi \right\rfloor - \left\lfloor \frac{d^{k}-1}{d-1}\alpha + \phi \right\rfloor - \left\lfloor \frac{d^{n+k}-1}{d-1}\alpha + \phi' \right\rfloor + \left\lfloor \frac{d^{k}-1}{d-1}\alpha + \phi' \right\rfloor$$

Using the well-known inequality $x - x' - 1 < \lfloor x \rfloor - \lfloor x' \rfloor < x - x' + 1$, one can show that

$$-2 < h_v(n,k) - h_{v'}(n,k) < 2.$$

As $h_v(n,k)$ and $h_{v'}(n,k)$ are integers, we have $-1 \le h_v(n,k) - h_{v'}(n,k) \le 1$, which ends the proof of the lemma. □
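The statement of Lemma 4.2.1 can be explored numerically with the closed form used in the proof. The sketch below evaluates $h(n,k) = \lfloor S(n+k)\alpha + \phi \rfloor - \lfloor S(k)\alpha + \phi \rfloor$ over a sample of inferior phases and reports the spread between the largest and smallest count. The function names and the sampled phases are illustrative assumptions; a finite sample is a sanity check, not a proof.

```python
from fractions import Fraction
from math import floor


def S(n, d):
    """Number of nodes of a complete d-ary tree with n levels."""
    return (d**n - 1) // (d - 1)


def ones(alpha, phi, d, n, k):
    """Ones in the factor of size n and width k below a node of inferior phase phi."""
    return floor(S(n + k, d) * alpha + phi) - floor(S(k, d) * alpha + phi)


def spread(alpha, d, n, k, phases):
    """Difference between the largest and smallest count over the given phases."""
    values = [ones(alpha, phi, d, n, k) for phi in phases]
    return max(values) - min(values)


phases = [Fraction(i, 97) for i in range(97)]  # hypothetical sample of phases
alpha = Fraction(3, 11)                        # hypothetical density
for n in range(1, 6):
    for k in range(0, 6):
        assert spread(alpha, 2, n, k, phases) <= 1
```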

We will see in Section 4.5 that a tree is rational if and only if its density can be written as $\frac{p}{S(n,k)}$ ($p, k, n \in \mathbb{N}$), therefore we will do the proof of Theorem 4.1 by distinguishing strongly balanced trees whose density is of this form from the others.

Lemma 4.2.2. If $\mathcal{A}$ is a strongly balanced tree of density $\alpha$ which cannot be written as $\frac{p}{S(n,k)}$ ($p, k, n \in \mathbb{N}$), then $\mathcal{A}$ is mechanical.

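Proof. Let $\tau$ be a real number and $v$ a node. At least one of the two following properties is true:

$$\forall n \ge 1: \quad h_v(n) \le \lfloor S(n)\alpha + \tau \rfloor, \qquad (17)$$
$$\forall n \ge 1: \quad h_v(n) \ge \lfloor S(n)\alpha + \tau \rfloor. \qquad (18)$$

To prove this, assume that it is not true. Then there exist $k, n$ such that $h_v(n) < \lfloor S(n)\alpha + \tau \rfloor$ and $h_v(k) > \lfloor S(k)\alpha + \tau \rfloor$. In that case the number of ones in the factor of size $n-k$ and width $k$ (or of size $k-n$ and width $n$ if $k > n$) is $h_v(n) - h_v(k) \le \lfloor S(n)\alpha + \tau \rfloor - \lfloor S(k)\alpha + \tau \rfloor - 2 < \frac{d^n - d^k}{d-1}\alpha - 1$, which violates formula (7).

Let us now define the number $\phi$ as the smallest $\tau$ that satisfies (17):

$$\phi \stackrel{\mathrm{def}}{=} \inf \{ \tau : \text{for all } n, \ h_v(n) \le \lfloor S(n)\alpha + \tau \rfloor \}.$$

For all $\tau > \phi$, equation (17) is true, while for all $\tau' < \phi$, equation (18) is true. This means that for all $\epsilon > 0$ and all $n$:

$$S(n)\alpha + \phi - \epsilon - 1 < \lfloor S(n)\alpha + \phi - \epsilon \rfloor \le h_v(n) \le \lfloor S(n)\alpha + \phi + \epsilon \rfloor \le S(n)\alpha + \phi + \epsilon. \qquad (19)$$

Taking the limit when $\epsilon$ tends to 0 shows that:

$$S(n)\alpha + \phi - 1 \le h_v(n) \le S(n)\alpha + \phi. \qquad (20)$$

Therefore, unless $S(n)\alpha + \phi \in \mathbb{N}$, $h_v(n) = \lfloor S(n)\alpha + \phi \rfloor = \lceil S(n)\alpha + \phi - 1 \rceil$.

If there exists $n \in \mathbb{N}$ such that $S(n)\alpha + \phi \in \mathbb{N}$, then there is no other $k \in \mathbb{N}$ ($k \ne n$) such that $S(k)\alpha + \phi \in \mathbb{N}$, otherwise that would violate the condition $\alpha \notin \{ \frac{p}{S(n,k)}, \ p, n, k \in \mathbb{N} \}$. Therefore, for this particular $n$, either $h_v(n) = S(n)\alpha + \phi = \lfloor S(n)\alpha + \phi \rfloor$, in which case the node is inferior of phase $\phi$, or $h_v(n) = S(n)\alpha + \phi - 1 = \lceil S(n)\alpha + \phi - 1 \rceil$, in which case the node is superior of phase $1 - \phi$. □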
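The infimum used in this proof has a simple finite-depth counterpart: since $h_v(n)$ is an integer, $h_v(n) \le \lfloor S(n)\alpha + \tau \rfloor$ holds exactly when $\tau \ge h_v(n) - S(n)\alpha$, so the smallest admissible $\tau$ for the first $N$ counts is $\max_{n \le N} (h_v(n) - S(n)\alpha)$. The sketch below computes this estimate from a list of counts. The name `estimate_phase` and the sample data are assumptions for illustration; with finitely many counts the value is only a lower estimate of the true infimum.

```python
from fractions import Fraction
from math import floor


def estimate_phase(alpha, counts, d):
    """Smallest tau with counts[n-1] <= floor(S(n)*alpha + tau) for every listed n."""
    return max(
        h - Fraction(d**n - 1, d - 1) * alpha
        for n, h in enumerate(counts, start=1)
    )


# Hypothetical data: counts of a node with inferior phase 2/5 and density 3/11.
alpha, true_phi, d, depth = Fraction(3, 11), Fraction(2, 5), 2, 12
counts = [
    floor(Fraction(d**n - 1, d - 1) * alpha + true_phi)
    for n in range(1, depth + 1)
]
phi_hat = estimate_phase(alpha, counts, d)

# The estimate never exceeds the phase that generated the counts,
# and it reproduces every count it was computed from.
assert phi_hat <= true_phi
assert all(
    floor(Fraction(d**n - 1, d - 1) * alpha + phi_hat) == h
    for n, h in enumerate(counts, start=1)
)
```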

Lemma 4.2.3. Let $\mathcal{A}$ be a tree such that there exist $n$ and $k$ such that all factors of size $(n,k)$ have the same number of nodes with label 1. Then the tree is mechanical.

Proof. Let us take $n$ and $k$ satisfying the property, such that $n+k$ is minimal, and let us call $p$ the common number of ones in the factors of size $(n,k)$. Obviously, the tree has a density $\alpha = \frac{p}{S(n,k)}$. Let $v$ be the root of the tree. The same proof as in the irrational case can be used to establish that there exists $\phi$ such that

$$S(n)\alpha + \phi - 1 \le h_v(n) \le S(n)\alpha + \phi,$$

and that the root is inferior of phase $\phi$ if there is no $j$ such that $h_v(j) = S(j)\alpha + \phi - 1$ and superior of phase $1 - \phi$ if there is no $i$ such that $h_v(i) = S(i)\alpha + \phi$. Therefore the tree is mechanical unless there exist $i$ and $j$ satisfying these equalities. Let us show that if such $i$ and $j$ exist, there is a contradiction.

Let $i = \min\{i' : h_v(i') = S(i')\alpha + \phi\}$ and $j = \min\{j' : h_v(j') = S(j')\alpha + \phi - 1\}$. Either $i < j$ or $i > j$; let us assume that $j < i$, the other case being similar. The number of ones in the factor of size $i - j$ and width $j$ is $p' = \frac{d^i - d^j}{d-1}\alpha + 1$. In that case we have $i > k + n$, otherwise that would violate the minimality of $n + k$. If $i - j > n$, the factor of size $i - j$ and width $j$ is composed of a factor of size $i - j - n$ and width $j$ and of $d^{i-n-k}$ factors of size $n$ and width $k$, which have exactly $p$ nodes labeled one as assumed in the previous paragraph. The number of ones in the factor of size $i - n$ rooted at $v$ is then:

$$h_v(i-n) = h_v(i) - d^{i-n-k}p = S(i-n)\alpha + \phi,$$

which violates the minimality of $i$.

Then if all factors of size $(n,k)$ have exactly $p$ nodes labeled 1, the tree is mechanical. □

Lemma 4.2.4. If $\mathcal{A}$ is a strongly balanced tree with a density $\alpha = \frac{p}{S(n,k)}$ then it has at most $n$ factors of size $(n,k)$ with $p+1$ ones.

Proof. Using Eq. (7), each factor of size $(n,k)$ has $p-1$, $p$ or $p+1$ nodes labeled by 1. As the tree is strongly balanced, either there is no factor with $p-1$ ones or no factor with $p+1$ ones. Let us assume that there is no factor with $p-1$ ones (the other case is similar). We claim that there are at most $n$ factors of size $(n,k)$ with $p+1$ nodes labeled by 1.

Figure 7: A block of size $(n' = in, k')$ is made of $j$ blocks of size $(n,k)$.

Indeed, let $f$ be a factor of size $(n',k')$ with $n' = in$, $i \in \mathbb{N}$, $k' \ge k$. This tree is composed of $j$ blocks of size $(n,k)$ (where $j$ depends on $i$ and $k'$, see Figure 7) and using Eq. (7) again, the number of nodes with label 1 is either $jp - 1$, $jp$ or $jp + 1$. Therefore at most one of the $(n,k)$ blocks has $p+1$ nodes labeled by 1.

Now, in the whole tree, if there were at least $n+1$ blocks of size $(n,k)$ with $p+1$ ones, starting at lines $l_1, \dots, l_{n+1}$, there would be two blocks with $l_i = l_j \bmod n$ and the block of size $(l_j - l_i + n, k)$ would have $jp + 2$ ones, which is not possible. Therefore there are at most $n$ blocks of size $(n,k)$ with $p+1$ nodes labeled by 1 in the whole tree. □

An example of a rational tree that is strongly balanced but not mechanical is presented in Figure 8.

Figure 8: Example of a rational tree that is strongly balanced but not mechanical. On the left is the tree itself. In the middle the mechanical suffixes of the tree are displayed and its corresponding minimal graph (reducible) is displayed on the right.

One can verify on the picture that the beginning of this tree is strongly balanced and, as it continues with density exactly 1/3, the whole tree is strongly balanced. However this tree is ultimately mechanical but not mechanical, since in a mechanical tree of density 1/3 all factors of size 2 should have $\lfloor 1 + \phi \rfloor = 1$ node labeled by one.

Lemma 4.2.5. A strongly balanced tree with density $\alpha = \frac{p}{S(n,k)}$ is ultimately mechanical. Furthermore, if the tree is irreducible, it is mechanical.

Proof. Using Lemma 4.2.4, there are at most $n$ factors of size $n$ and width $k$ with $p+1$ nodes labeled 1; in the rest of the tree all factors of size $(n,k)$ have exactly $p$ ones. Then the tree is ultimately mechanical by Lemma 4.2.3.

If the tree is irreducible, a factor appears either 0 or an infinite number of times. As there are at most $n$ factors of size $(n,k)$ with $p+1$ nodes labeled 1, there are no such factors and the tree is mechanical by Lemma 4.2.3.

Note that this lemma concludes the proof of Theorem 4.1. □

4.5 Link with Sturmian trees

In the case of words, Sturmian words are exactly the balanced (or mechanical) aperiodic words. The case of trees does not work as well since the Dyck tree (Figure 3) and more generally all examples of Sturmian trees given in [4] are not balanced. However, the other implication holds, as seen in the following theorem:

Theorem 4.2. The following propositions are true.

• A strongly balanced tree of density different from $\frac{p}{S(n,k)}$ for any $p, n, k \in \mathbb{N}$ is Sturmian.

• A strongly balanced tree of density $\frac{p}{S(n,k)}$ for some $p, n, k \in \mathbb{N}$ is rational.

This result has a simple implication: a strongly balanced tree is rational if and only if there exist $p, n, k \in \mathbb{N}$ such that its density is $\frac{p}{S(n,k)}$.

Proof. Let us consider the case of inferior mechanical trees (the superior case being similar).

Let $\mathcal{A}$ be a mechanical tree of density $\alpha$, let $v$ be a node and let $n > 0$. According to Proposition 4.2.1, the factor of size $n$ only depends on the phase $\phi_v$ of its root. In fact, one can show that this factor only depends on the values $\lfloor S(i)\alpha + \phi_v \rfloor$ for $1 \le i \le n$. For all $i > 0$ and $\phi \in [0;1]$, we define the quantities $f_i(\phi) \stackrel{\mathrm{def}}{=} \lfloor S(i)\alpha + \phi \rfloor$. The number of factors of size $n$ only depends on the values $f_1(\phi), \dots, f_n(\phi)$. As seen in (16), the set of phases is dense in $[0;1]$, therefore there are exactly as many factors of size $n$ as tuples $(f_1(\phi), \dots, f_n(\phi))$ when $\phi \in [0;1)$, by right-continuity of the $f_i$.

Each $f_i$ is an increasing function taking integer values and $f_i(1) - f_i(0) = 1$. Thus there are at most $n+1$ different tuples, hence at most $n+1$ factors of size $n$, and a mechanical tree is either rational or Sturmian.

Moreover, if $\alpha \notin \{ \frac{p}{S(n,k)}, \ p, n, k \in \mathbb{N} \}$, we never have $i \ne j$ with both $S(i)\alpha + \phi \in \mathbb{N}$ and $S(j)\alpha + \phi \in \mathbb{N}$, and then there are exactly $n+1$ factors of size $n$.

If $\alpha = \frac{p}{S(n,k)}$ then the number of factors of size $n$ is at most $n$ (see Section 4.3.1). Therefore the tree is rational using Theorem 3.1.

If the tree is not mechanical, then Theorem 4.1 says that the tree has density $\alpha = \frac{p}{S(n,k)}$ and is ultimately mechanical: there exists a depth $D \ge 1$ after which the tree is mechanical. Therefore, there are at most $S(D) + n$ factors of any size ($n$ in the mechanical children because of the value of $\alpha$, plus $S(D)$ in the prefix sub-tree). In that case the tree is rational by Theorem 3.1. □
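The counting argument of this proof can be sketched in a few lines. Each step function $f_i(\phi) = \lfloor S(i)\alpha + \phi \rfloor$ has at most one jump inside $[0;1)$, located at $1 - \mathrm{frac}(S(i)\alpha)$ when this fractional part is non-zero, so the number of distinct tuples is one plus the number of distinct jump locations. The function name `count_tuples` and the sample densities are assumptions made for this illustration.

```python
from fractions import Fraction
from math import floor


def count_tuples(alpha, d, n):
    """Number of distinct tuples (f_1(phi), ..., f_n(phi)) for phi in [0, 1)."""
    jumps = set()
    for i in range(1, n + 1):
        frac = (Fraction(d**i - 1, d - 1) * alpha) % 1
        if frac != 0:
            jumps.add(1 - frac)  # f_i increases by one at this phase
    return len(jumps) + 1


# Density 3/11 with d = 2: the five jump locations are distinct, giving n + 1 tuples.
assert count_tuples(Fraction(3, 11), 2, 5) == 6
# Density 1/3 with d = 2: all jumps coincide, so only two tuples exist.
assert count_tuples(Fraction(1, 3), 2, 6) == 2
```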

5 Algorithmic issues 

5.1 Testing if a rational tree is strongly balanced 

Given a finite description of a rational tree, let us consider the problem of checking whether this tree is strongly balanced. An algorithm that works in time $O(n^3)$, where $n$ is the number of vertices of the minimal graph of the tree, is presented.

The first focus is on the description of the special structure of the minimal graph of a rational strongly balanced tree. Then an algorithm for irreducible rational trees is described, as well as a sketch of the algorithm for the general case.

5.1.1 Graph of rational strongly balanced trees 

Let us first consider a rational mechanical tree of density $\alpha$. We know that there exist $p, k, n > 0$ such that $\alpha = \frac{p}{S(n,k)}$. Using Section 4.3.1, the minimal graph has exactly $n+k$ nodes, and for any node, the set of all possible phases of all its descendants is $[0;1)$. Therefore, the graph is strongly connected and unique. The only difference between two rational mechanical trees of the same density is to which node the root of the tree is associated. Figure 9 displays several examples. The (unique) minimal graphs of the mechanical trees of density 1/3, 1/7, 4/15 and 2/15 are displayed.




Figure 9: These graphs represent all mechanical trees of density 1/3, 1/7, 4/15 and 6/15 = 2/5. 
For all graphs with n nodes, there are exactly n different mechanical trees of this particular density, 
depending on which node is associated to the root. Note that the first three graphs have a very 
similar structure (Figure 16 displays more mechanical trees with this structure). 

If the tree is strongly balanced but not mechanical, it is ultimately mechanical (see Lemma 4.2.5), which means that after a finite depth $k$, all suffixes are mechanical trees with the same density. All of these trees have the same graph, therefore the minimal graph has a unique final strongly connected component which is reached in at most $k$ steps. Therefore, the minimal graph of a strongly balanced tree can be decomposed into a finite acyclic graph and one final strongly connected component, as in Figure 10.







Figure 10: General form of the graph of a reducible strongly balanced tree: an acyclic graph ending 
in a unique strongly connected component. 






5.1.2 Irreducible trees 

Testing if two graphs with a given fixed out-degree are isomorphic can be done in polynomial time 
[15]. Therefore using the result shown in the previous section 5.1.1, an algorithm to test if a graph 
represents a mechanical tree can be obtained by computing the density a of the graph and testing 
if the graph is isomorphic to the graph of all mechanical trees with density a. However this is not 
very efficient and here we propose an algorithm that tests the balance property directly. 

Consider an irreducible rational tree $\mathcal{A}$ and let $n_0$ be the number of vertices of its minimal graph. Theorem 4.2 says that it is strongly balanced if and only if it is mechanical. In that case its density is $\frac{p}{S(n_0,k_0)}$ for some $p, k_0 \in \mathbb{N}$ and all sub-trees of size $(n_0, k_0)$ have exactly $p$ nodes with label 1. Such factors will be called basic blocks in the following.

Recall that the tree is strongly balanced if all factors of size $(n,k)$ have $\lfloor \alpha S(n,k) \rfloor$ or $\lfloor \alpha S(n,k) \rfloor + 1$ nodes of label one. We want to show that testing it for all $n, k \le n_0 + k_0$ is sufficient.

Let $v$ be a node and $n, k > 0$, and let $h_v(F)$ be the number of labels 1 in the factor $F$ of size $(n,k)$ with root $v$.

Starting from $F$, we construct a new factor $F'$ by adding a new factor of size $(n_0, k - n_0)$ on top of $F$. This new factor can be partitioned into $d^{k-n_0-k_0}$ basic blocks. The total factor $F'$ is of size $(n + n_0, k - n_0)$ and its number of ones is $h_v(F') = h_v(F) + d^{k-n_0-k_0}p$ (see Figure 11).

The augmentation of the factor can be repeated until its size $(n', k')$ is such that $k' < k_0 + n_0$. Its number of ones is $h_v(F') = h_v(F) + H$ where $H$ does not depend on $v$.




Figure 11: The first transformation: if $k \ge n_0 + k_0$, we add a level of factors of size $(n_0, k_0)$ that all contain exactly $p$ ones. The size of the factor becomes $(n + n_0, k - n_0)$. We repeat the transformation until the size is $(n', k')$ with $k' < n_0 + k_0$. In the figure, $T_k$ stands for $p\,d^{k-n_0-k_0}$.

The second phase consists in building a new factor $F''$ by removing from $F'$ a factor of size $(n_0, k' + n' - n_0)$. The removed part can be partitioned into $d^{n'-n_0-k_0}$ basic blocks. Therefore the number of ones in $F''$ is $h_v(F'') = h_v(F') - d^{n'-n_0-k_0}p$. This transformation is illustrated in Figure 12.

By repeating this transformation as long as $n'' \ge n_0 + k_0$, we get a final factor $F''$ whose size is $(n'', k'')$ with $n'' < n_0 + k_0$, $k'' < n_0 + k_0$ and whose number of ones is $h_v(F'') = h_v(F) + H - K$, where $H$ and $K$ do not depend on $v$ but only on $n$ and $k$.

Since $h_v(F) = h_v(F'') - H + K$, it is enough to compute the number of ones in all factors with size $(n'', k'')$ where $n'' < n_0 + k_0$, $k'' < n_0 + k_0$, to be able to obtain the number of ones in all factors of any size.

Figure 12: The second transformation: if $n' \ge n_0 + k_0$, we can remove a level of factors of size $(n_0, k_0)$. The size of the factor becomes $(n' - n_0, k')$. We repeat the transformation until the size is $(n'', k'')$ with $n'' < n_0 + k_0$ (here, $T_{n'} = p\,d^{n'-n_0-k_0}$).

Also, it is enough to test that all factors with size $(n'', k'')$ where $n'' < n_0 + k_0$, $k'' < n_0 + k_0$ satisfy the strong balance property for all factors of any size to have the same property.

There are at most $n$ sub-trees of a given height and width. For $i \le n$, let us call $h_{i,\ell,m}$ the number of ones in the $i$-th sub-tree of height $\ell$ and width $\ell + m$. Let us call $v(i) = (v_1(i), \dots, v_d(i))$ the set of the $d$ children of the tree $i$. The value $h_{i,\ell,m}$ can be computed from the values of the children using formula (21).

These considerations yield the following Algorithm 1.



Algorithm 1 Testing if a rational tree is strongly balanced
Require: Minimal graph $G$ of an irreducible rational tree
Ensure: The tree corresponding to $G$ is strongly balanced
    $N$ := number of vertices of $G$
    Compute the density $\alpha$ of the Markov chain
    if for all $k$: $\frac{d^N - d^k}{d-1}\alpha \notin \mathbb{N}$ then
        return "not strongly balanced"
    end if
    for $1 \le i, n, k \le N$ do
        Compute $h_{i,n,k}$ according to (21)
        if $h_{i,n,k} \notin \{ \lfloor \frac{d^n - d^k}{d-1}\alpha \rfloor, \lfloor \frac{d^n - d^k}{d-1}\alpha \rfloor + 1 \}$ then
            return "not strongly balanced"
        end if
    end for
    return "strongly balanced"

Solving the Markov chain to get $\alpha$ takes at most $O(N^3)$ operations. Writing the density in the form $\frac{p(d-1)}{d^N - d^k}$ is linear in $N$ and computing all $h_{i,n,k}$ takes $O(N^3)$ operations using the formula (21). Therefore the algorithm runs in time $O(N^3)$.
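The direct balance test can be sketched on a graph representation of a rational tree. The code below is not Algorithm 1 itself: it skips the density computation and simply compares, for every size $n$ and width $k$ up to a chosen bound, the number of ones of the factors rooted at every vertex. The graph encoding (a dictionary mapping a vertex to its label and list of children), the function names and the two sample graphs are assumptions made for this illustration, and every vertex is assumed to be reachable from the root.

```python
def ones_by_depth(graph, max_depth):
    """table[v][m] = number of ones at depth exactly m below vertex v."""
    table = {v: [label] for v, (label, _) in graph.items()}
    for m in range(1, max_depth + 1):
        for v, (_, children) in graph.items():
            table[v].append(sum(table[c][m - 1] for c in children))
    return table


def is_strongly_balanced_up_to(graph, bound):
    """Check that factors of equal size n and width k differ by at most one 1."""
    table = ones_by_depth(graph, 2 * bound)
    for n in range(1, bound + 1):
        for k in range(0, bound + 1):
            # A factor of size n and width k covers depths k, ..., k + n - 1.
            values = [sum(row[k:k + n]) for row in table.values()]
            if max(values) - min(values) > 1:
                return False
    return True


# Hypothetical minimal graph of the binary mechanical tree of density 1/3.
mechanical = {"A": (0, ["A", "B"]), "B": (1, ["A", "A"])}
# Hypothetical graph where a subtree of ones coexists with a subtree of zeros.
unbalanced = {"X": (1, ["X", "Y"]), "Y": (0, ["Y", "Y"])}

assert is_strongly_balanced_up_to(mechanical, 6)
assert not is_strongly_balanced_up_to(unbalanced, 6)
```

Counting ones level by level keeps the cost polynomial in the number of vertices and in the bound, instead of exploring the exponentially many nodes of each factor.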

5.1.3 General case 

The general case is more complicated since there can be some factors of size $(n_0, k_0)$ with $p+1$ (or $p-1$) nodes labeled by 1. However the structure of the minimal graph of strongly balanced trees described in Section 5.1.1 can be useful.

• Indeed, the minimal graph must have only one final strongly connected component and it must correspond to a strongly balanced tree.

• If the density of the strongly connected component is $\frac{p}{S(n_0,k_0)}$, the factors of size $(n_0, k_0)$ in the strongly connected component have exactly $p$ nodes labeled by 1.

Therefore, using the same techniques of reduction of the size as in Figure 11, one can show that we just have to test the balanced property for factors of size at most $(n, n)$ where $n$ is the number of vertices in the graph.

5.2 Counting 

In this part, we address the problem of counting all possible factors of a mechanical tree. We will 
focus on trees of degree 2 and will compare this to the total number of possible factors of binary 
trees. 

There are $2^m$ finite words of length $m$. Not all these words can be factors of a Sturmian word: for example 0011 cannot be, since it is not balanced. In fact, the number of factors of length $m$ of Sturmian words is

$$1 + \sum_{i=1}^{m} (m - i + 1)\phi(i) \qquad (22)$$

where $\phi$ is the Euler function: $\phi(i)$ is the number of positive integers not exceeding $i$ that are coprime with $i$. Asymptotically, this number is equal to $m^3/\pi^2$.
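Formula (22) can be compared with a brute-force enumeration for small lengths. The sketch below evaluates the formula and, separately, counts the binary words of length $m$ that are balanced, relying on the classical fact that finite balanced words are exactly the factors of Sturmian words. All function names are assumptions for illustration, and the enumeration is only practical for small $m$.

```python
from itertools import product
from math import gcd


def euler_phi(i):
    """Number of integers in 1..i that are coprime with i."""
    return sum(1 for j in range(1, i + 1) if gcd(i, j) == 1)


def sturmian_factor_count(m):
    """Value of formula (22) for words of length m."""
    return 1 + sum((m - i + 1) * euler_phi(i) for i in range(1, m + 1))


def is_balanced(word):
    """True when any two factors of equal length differ by at most one 1."""
    n = len(word)
    for length in range(1, n):
        weights = [sum(word[s:s + length]) for s in range(n - length + 1)]
        if max(weights) - min(weights) > 1:
            return False
    return True


def count_balanced_words(m):
    """Brute-force count of balanced binary words of length m."""
    return sum(1 for w in product((0, 1), repeat=m) if is_balanced(w))


assert not is_balanced((0, 0, 1, 1))  # the example 0011 from the text
for m in range(1, 9):
    assert sturmian_factor_count(m) == count_balanced_words(m)
```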

The number $a_n$ of unordered complete binary trees of height $n$ satisfies the equation:

$$a_{n+1} = a_n(a_n + 1) \qquad (23)$$

According to [17], there is no simple solution of this equation but, using the method described in [1], one can show that $a_n$ is the nearest integer to $\theta^{2^n} - 1/2$, where $\theta \approx 1.597910218$ is the exponential of the rapidly convergent series $\ln(3/2) + \sum_{n \ge 0} 2^{-n-1} \ln\left(1 + (2a_n + 1)^{-2}\right)$.
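The closed form can be checked numerically for the first few values. The sketch below iterates recurrence (23), evaluates the series for $\theta$ with a handful of terms, and compares $\theta^{2^n} - 1/2$ with $a_n$. The initial value $a_0 = 1$ is an assumption of this sketch (it is the value for which the series starts at $\ln(3/2)$), as are the function names; floating-point precision limits the comparison to small $n$.

```python
from math import exp, log


def tree_counts(n_max):
    """Values a_0, ..., a_{n_max} of recurrence (23), assuming a_0 = 1."""
    a = [1]
    for _ in range(n_max):
        a.append(a[-1] * (a[-1] + 1))
    return a


def theta(terms=8):
    """Exponential of the series, truncated after the given number of terms."""
    a = tree_counts(terms)
    series = log(1.5) + sum(
        2.0 ** (-n - 1) * log(1 + (2 * a[n] + 1) ** -2) for n in range(terms)
    )
    return exp(series)


t = theta()
a = tree_counts(5)
assert abs(t - 1.597910218) < 1e-8
for n in range(6):
    assert round(t ** (2**n) - 0.5) == a[n]
```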

In Section 4.5, we have seen that the number of factors of size $n$ of a Sturmian tree is the number of tuples $(f_1(\phi, \alpha), \dots, f_n(\phi, \alpha))$ where $f_i(\phi, \alpha) = \lfloor (2^i - 1)\alpha + \phi \rfloor$. Let us call $u_n$ this number. To count the number of these tuples, we draw the lines for which $(2^n - 1)\alpha - \phi \in \mathbb{N}$ (see Figure 13). The number of tuples is the number of different zones on this figure.

Figure 13: Lines $\phi = (2^n - 1)\alpha - i$ for $n \ge 0$ and $0 \le i \le 2^n - 1$.

An exact count of $u_n$ is cumbersome to obtain but good bounds can be computed easily. $u_{n+1} - u_n$ corresponds to the number of zones added by adding the lines $\alpha \mapsto (2^{n+1} - 1)\alpha - i$. Each of these $2^{n+1} - 1$ lines:

• adds at least one new zone if it only crosses other lines at points $\phi = 0$ or $\phi = 1$. This is a very low estimate since it is only true for $i = 0$ or $i = 2^{n+1} - 2$; in the other cases it crosses at least the line $\alpha \mapsto \alpha$.

• adds at most $1 + n$ zones if it crosses the $n$ lines corresponding to $\alpha \mapsto (2^j - 1)\alpha - i_j$, $1 \le j \le n$, and if all these points are pairwise distinct.

Therefore we have an estimation for all $n \ge 2$:

$$2 + 2(2^{n+1} - 3) \le u_{n+1} - u_n \le (n+1)(2^{n+1} - 1) \qquad (24)$$

This leads to the bounds for $n \ge 3$:

$$2^n \le u_n \le (n-1)2^{n+1}. \qquad (25)$$

Improving these bounds seems difficult. To do so, one would have to count whether a "new" intersection has already been counted or if it is on the boundary $\phi = 0$. By simulation, it seems that the number of trees is closer to $n2^n$ than to $2^n$.



6 Extremal properties 

In this section, we show that strongly balanced trees are extremal for certain convex cost functions 
that can be used in scheduling problems. 

Let $g : \mathbb{R}^+ \to \mathbb{R}^+$ be a convex function. Let us assume that $g$ has a minimum in $0 < \alpha < 1$.

For each node $v$ and each factor of size $n$ rooted in $v$, we define the cost of the factor as a convex function of its density of ones $d_v(n) = h_v(n)/S(n)$:

$$C_v(n) \stackrel{\mathrm{def}}{=} g(d_v(n)).$$

We can define a cost $C^k$ of order $k$ of the total tree by considering all nodes in the sub-tree $\mathcal{A}_\ell$ of height $\ell$ rooted in $r$, as the Cesàro limit:

$$C^k \stackrel{\mathrm{def}}{=} \limsup_{\ell \to \infty} \frac{\sum_{v \in \mathcal{A}_\ell} C_v(k)}{S(\ell)}.$$

For each $k$, this cost is minimized when the number of ones in a tree of height $k$ is between $\lfloor \alpha S(k) \rfloor$ and $\lceil \alpha S(k) \rceil$. This means that a strongly balanced tree will minimize any increasing function of all $C^k$ (for example the average value over all $k$).

This has potential applications in optimization problems in distributed systems with a binary causal structure and would generalize some results in [2].

Consider for example a scheduling problem with two processors (with related speeds $u_0$ and $u_1$) and an infinite set of tasks with a dependency pattern that forms a tree with degree $d$. Tasks at level $k$ in the tree have size $1/d^k$ and are released 1 unit of time after their father.

Theorem 6.1. Under the ongoing assumptions, there exists an optimal density $\alpha \in (0,1)$ such that a strongly balanced tree with density $\alpha$ is the optimal allocation of tasks to processors 0 and 1, in terms of average flow time.

Proof (sketch). First, one should notice that at each second a load of one unit of work arrives in the system. Also note that every second, the allocation pattern forms a balanced sequence prefix with density $\alpha$. Finally, up to level $k$, all tasks can be seen as clusters of tasks of size $1/d^k$.

In [2], it is shown that the optimal allocation (for the average flow time) when tasks come in clusters of size $m$ (for any $m$) is a balanced sequence over the clusters.

Using a diagonal process over all sizes of clusters up to level $k$ shows that strongly balanced trees are optimal up to level $k$. This is true for any $k$.

The end of the proof comes from taking the limit when $k$ goes to infinity. □

Note that an arrival pattern that forms a tree of degree $d$ may arise when tasks are generated by a recursive program. Actually, this result can be generalized to more general arrival patterns. If tasks at level $k$ in the tree are released after i.i.d. stochastic times (with an arbitrary distribution but expectation equal to one), an allocation of the tasks to processors 0 and 1 according to a strongly balanced tree is again optimal. However, the proof of this result is beyond the scope of this paper.
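The proof sketch above relies on balanced sequences of a given density. As an illustration only, such a sequence can be produced with the standard mechanical construction $w_n = \lfloor (n+1)\alpha + \varphi \rfloor - \lfloor n\alpha + \varphi \rfloor$ and its balance checked directly. This is a one-dimensional sketch and not the tree allocation of Theorem 6.1; the density `alpha`, the phase `0.0` and the prefix length are hypothetical values.

```python
import math

def mechanical_word(alpha, phase, length):
    """First `length` letters of the mechanical word of slope alpha and given phase."""
    return [math.floor((n + 1) * alpha + phase) - math.floor(n * alpha + phase)
            for n in range(length)]

def is_balanced(word):
    """True if any two factors of equal length differ by at most one in their number of ones."""
    prefix = [0]
    for letter in word:
        prefix.append(prefix[-1] + letter)
    for length in range(1, len(word) + 1):
        counts = [prefix[i + length] - prefix[i]
                  for i in range(len(word) - length + 1)]
        if max(counts) - min(counts) > 1:
            return False
    return True

alpha = math.sqrt(2) - 1  # hypothetical density of ones
allocation = mechanical_word(alpha, 0.0, 200)

# Letter n can be read as the processor (0 or 1) receiving the n-th cluster.
print(allocation[:12])
print(is_balanced(allocation), sum(allocation) / len(allocation))
```

The empirical frequency of ones in the prefix stays close to `alpha`, and every window of a given length contains one of two consecutive numbers of ones, which is the regularity that the flowtime argument exploits.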



7 Glossary 

The aim of this part is to give the big picture and to provide several examples of trees that are either balanced, strongly balanced, reducible, irreducible, rational or Sturmian. In particular, we will give counter-examples that show that the inclusions between these notions are strict. Figure 14 illustrates these results.







Figure 14: Inclusion relations between the different definitions that we presented. Each number refers to an example detailed in Section 7. For example, 5 is the set of trees that are rational, reducible, ultimately mechanical, strongly balanced, balanced and neither mechanical nor Sturmian.
1. Reducible Sturmian tree that is not balanced - contrary to the case of words, where Sturmian words are balanced, there exist Sturmian trees that are not balanced. The Dyck tree, see Figure 3, is one of them.

2. Irreducible Sturmian trees that are not balanced - An example of a Sturmian tree that is irreducible (but not balanced) is the reflected random walk tree represented in Figure 15. It is Sturmian since the equivalence classes of the relation $\equiv_n$ are $\{0\}, \ldots, \{n-1\}, \{n, n+1, \ldots\}$.




Figure 15: The reflected random walk tree: each node of type $n$ is followed by one of type $n-1$ and one of type $n+1$ (except for 0, which is followed by 0 and 1).



3. Irreducible rational trees - see Figure 6. 

4. Reducible rational trees - see Figure 5. 

5. Rational reducible strongly balanced tree that is not mechanical - strongly balanced trees are not necessarily mechanical in the case of reducible rational trees, but only ultimately mechanical; see Figure 8 for an example.

6. Reducible mechanical trees - let $\alpha$ be a normal number and consider the mechanical tree of density $\alpha$ and phase 0 at the root. As $\alpha$ is normal, there is a unique phase corresponding to each node of order $k$, which is the fractional part of:

for a unique sequence $i_1, \ldots, i_k$. One can show that if two sequences $i_1, \ldots, i_k$ are different, then these phases are different, which shows that the minimal graph of the tree is exactly the tree itself.

7. Irreducible mechanical trees - let $w$ be a mechanical word and consider a graph with vertices $\{0, 1, \ldots\}$, where a node $i \ge 0$ has label one if and only if $w_i = 1$. The node $i$ has two outgoing arcs: one ending in $i + 1$, one ending in 0. We call this graph a restart tree since, for a node $n$, we have the choice between restarting back in 0 or continuing in $n + 1$. An example is displayed in Figure 16.

Figure 16: Example of the restart tree corresponding to the word aabaaab . . .

Figure 17: Number of ones in a factor of the restart tree of size 5

As seen in Figure 17, the number of ones in a factor of size $n$ that corresponds to the node $i$ is

$$h_i(n) = w_i + \cdots + w_{i+n-1} + h_0(n-1) + \cdots + h_0(1), \tag{27}$$

and the number of ones in a factor of size $n$ and width $k$ is

$$h_i(n, k) = h_i(n) - h_i(k) = w_{i+k} + \cdots + w_{i+n-1} + h_0(n-1) + \cdots + h_0(k). \tag{28}$$

Therefore the tree is strongly balanced if and only if the word $w$ is balanced. Since the tree is irreducible, in that case the tree is also mechanical. Moreover, one can show that for any word $w$ the tree has a density, which is $\lim_{n \to \infty} \frac{h_0(n)}{2^n - 1} = \frac{w_0}{2} + \frac{w_1}{4} + \frac{w_2}{8} + \cdots$.

Thus, for any aperiodic balanced word, this gives us an example of an irreducible, irrational, strongly balanced tree.
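The counting argument for the restart tree can be replayed on a computer. The sketch below is an illustration under assumptions, not the authors' code: it computes $h_i(n)$ directly from the graph (node $i$ has label $w_i$ and children $i+1$ and 0), compares it with the closed form (27), and checks that counts over factors of equal size differ by at most one when $w$ is a prefix of a mechanical word. The word, its density and the ranges explored are hypothetical choices.

```python
import math
from functools import lru_cache

def restart_tree_counts(word):
    """Return h(i, n): number of ones in the factor of height n rooted at node i.

    Node i carries label word[i]; its two children are node i + 1 and node 0.
    """
    @lru_cache(maxsize=None)
    def h(i, n):
        if n == 0:
            return 0
        return word[i] + h(i + 1, n - 1) + h(0, n - 1)
    return h

def closed_form(word, h, i, n):
    """Right-hand side of (27): w_i + ... + w_{i+n-1} + h_0(n-1) + ... + h_0(1)."""
    return sum(word[i:i + n]) + sum(h(0, m) for m in range(1, n))

# Hypothetical input: a prefix of a mechanical (hence balanced) word.
alpha = math.sqrt(2) - 1
word = [math.floor((j + 1) * alpha) - math.floor(j * alpha) for j in range(40)]
h = restart_tree_counts(word)

for n in range(1, 11):
    counts = [h(i, n) for i in range(20)]
    # The recursive count agrees with the closed form for every root tried.
    assert all(h(i, n) == closed_form(word, h, i, n) for i in range(20))
    # Smallest count, largest count, and empirical density at the root.
    print(n, min(counts), max(counts), h(0, n) / (2 ** n - 1))
```

Unrolling the recursion at node 0 gives $h_0(n) = 2 h_0(n-1) + w_{n-1}$, so the last printed column is a weighted sum of the letters of $w$ with weights halving at each step, which is why it stabilizes quickly as $n$ grows.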

8. Rational balanced tree that is not strongly balanced - An example of a rational tree that is balanced but not strongly balanced is presented in Figure 18. One can show that all of its factors of size 3 have exactly 4 nodes of label one. Using this fact, one can show that the number of ones in a factor of size $3n + i$ ($0 \le i < 3$) rooted in a node $j$ is:



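Size      Node 1                           Node 2                           Node 3                           Node 4
$3n$      $4\frac{8^n-1}{7}$               $4\frac{8^n-1}{7}$               $4\frac{8^n-1}{7}$               $4\frac{8^n-1}{7}$
$3n+1$    $1 + 2 \cdot 4\frac{8^n-1}{7}$   $0 + 2 \cdot 4\frac{8^n-1}{7}$   $0 + 2 \cdot 4\frac{8^n-1}{7}$   $1 + 2 \cdot 4\frac{8^n-1}{7}$
$3n+2$    $1 + 4 \cdot 4\frac{8^n-1}{7}$   $1 + 4 \cdot 4\frac{8^n-1}{7}$   $2 + 4 \cdot 4\frac{8^n-1}{7}$   $2 + 4 \cdot 4\frac{8^n-1}{7}$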



This shows that the tree is balanced. It is not strongly balanced, since there are factors of size (1,1) with 2 nodes labeled by one and others with 0 nodes labeled by one, as seen in the bottom right part of Figure 18. Also, its minimal graph is not isomorphic to the unique minimal graph of a mechanical tree of density 4/7, which has only 3 nodes (see the discussion about graphs of strongly balanced trees in Section 5.1.1).

Figure 18: A rational balanced tree that is not strongly balanced

9. Irrational balanced tree that is not strongly balanced - Building an irrational tree that is not strongly balanced requires more work. We consider a tree which has a root $r$ labeled by 0 and two children that are mechanical trees of density $\alpha$ and respective phases $\varphi$ and $\varphi + a$. We will see that, under some conditions on $\alpha$, $\varphi$ and $a$, this gives an example of an irrational tree that is balanced but neither strongly balanced, nor rational, nor Sturmian.



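The two children of the root are balanced trees, which means that the tree is balanced if and only if, for all $n$:

$$\lfloor (2^{n+1} - 1)\alpha \rfloor \le h_r(n+1) \le \lfloor (2^{n+1} - 1)\alpha \rfloor + 1. \tag{29}$$

Let us call $k = \lfloor (2^n - 1)\alpha + \varphi \rfloor$ and $x = \mathrm{frac}((2^n - 1)\alpha + \varphi)$. Then

$$h_r(n+1) = \lfloor (2^n - 1)\alpha + \varphi \rfloor + \lfloor (2^n - 1)\alpha + \varphi + a \rfloor = k + \lfloor k + x + a \rfloor.$$

As $(2^{n+1} - 1)\alpha = 2k + 2x + \alpha - 2\varphi$, equation (29) holds if, for all $x \in [0, 1)$, we have

$$0 \le k + \lfloor k + x + a \rfloor - \lfloor 2k + 2x + \alpha - 2\varphi \rfloor \le 1,$$

which holds if, for all $x \in [0, 1)$:

$$0 \le \lfloor x + a \rfloor - \lfloor 2x + \alpha - 2\varphi \rfloor \le 1.$$

This inequality is satisfied if and only if

$$(x + a < 1 \text{ and } -1 \le 2x - 2\varphi + \alpha < 1) \quad \text{or} \quad (x + a \ge 1 \text{ and } 0 \le 2x - 2\varphi + \alpha < 2).$$

Looking at the extremal cases for $x + a < 1$ and $x + a \ge 1$, which are $x = 0$, $1 - a$, $1$, one gets 4 relations:

$$2(1 - a) - 2\varphi + \alpha \le 1,$$
$$-1 \le -2\varphi + \alpha,$$
$$2 - 2\varphi + \alpha \le 2,$$
$$0 \le 2(1 - a) - 2\varphi + \alpha.$$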
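Therefore the tree is balanced if and only if

$$\frac{\alpha}{2} \le \varphi \le \frac{\alpha + 1}{2} \le \varphi + a \le 1. \tag{30}$$

Moreover, if $\alpha + \varphi > 1$ and $3\alpha + \varphi < 2$, the tree is not strongly balanced: its beginning consists of a root labeled 0 whose two children are labeled 1, while the two children of the first of these are both labeled 0.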



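There are many triples $\alpha, \varphi, a$ satisfying condition (30). For example, a tree with $\alpha = \frac{3}{7} + \varepsilon$, $\varphi = 0.6$ and $a = 0.2$, where $\varepsilon \in \mathbb{R} \setminus \mathbb{Q}$ is small enough (for example $\varepsilon < 0.01$ works, since $\frac{\alpha}{2} \approx 0.21 < 0.6 < \frac{\alpha + 1}{2} \approx 0.71 < 0.8 < 1$, $\alpha + \varphi > 1$ and $3\alpha + \varphi \approx 1.9 < 2$).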
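The numerical example can be checked with exact arithmetic. The sketch below is an illustration under assumptions: it reads a mechanical tree of density $\alpha$ and phase $\varphi$ as having $\lfloor (2^n - 1)\alpha + \varphi \rfloor$ ones in its factor of height $n$, takes the density to be $3/7 + \varepsilon$ (the value consistent with the approximations 0.21, 0.71 and 1.9 quoted above), and replaces the irrational $\varepsilon$ by the rational stand-in $1/1000$ so that all floors are computed exactly. The inequalities tested do not depend on the irrationality of $\varepsilon$, but the resulting tree is only irrational when $\varepsilon$ is.

```python
from fractions import Fraction
from math import floor

def mechanical_count(alpha, phase, n):
    """Ones in the factor of height n of a mechanical tree (assumed form)."""
    return floor((2 ** n - 1) * alpha + phase)

def root_count(alpha, phi, a, n):
    """h_r(n): root labeled 0, children mechanical with phases phi and phi + a."""
    if n == 0:
        return 0
    return (mechanical_count(alpha, phi, n - 1)
            + mechanical_count(alpha, phi + a, n - 1))

def satisfies_30(alpha, phi, a):
    """Chain of inequalities alpha/2 <= phi <= (alpha+1)/2 <= phi + a <= 1."""
    return alpha / 2 <= phi <= (alpha + 1) / 2 <= phi + a <= 1

def satisfies_29(alpha, phi, a, max_n):
    """Check the two-sided bound on h_r(n + 1) for n = 0, ..., max_n - 1."""
    for n in range(max_n):
        low = floor((2 ** (n + 1) - 1) * alpha)
        if not low <= root_count(alpha, phi, a, n + 1) <= low + 1:
            return False
    return True

# Rational stand-in for 3/7 + epsilon (hypothetical value of epsilon).
alpha = Fraction(3, 7) + Fraction(1, 1000)
phi, a = Fraction(3, 5), Fraction(1, 5)

print(satisfies_30(alpha, phi, a), satisfies_29(alpha, phi, a, 40))  # True True

# Ones among the two children of the root, and among the two children
# of the first child: the gap of 2 rules out strong balance.
ones_below_root = (mechanical_count(alpha, phi, 1)
                   + mechanical_count(alpha, phi + a, 1))
ones_below_first_child = (mechanical_count(alpha, phi, 2)
                          - mechanical_count(alpha, phi, 1))
print(ones_below_root, ones_below_first_child)  # 2 0
```

Changing the phases to values that violate the chain of inequalities, for instance $\varphi = a = 0.1$, makes the two-sided bound fail already at $n = 1$, which gives a quick sanity check of the direction of each inequality.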